In 2007 we introduced a general model of sparse random graphs with
(conditional) independence between the edges. The aim of this paper is
to present an extension of this model in which the edges are far from
independent, and to prove several results about this extension. The basic
idea is to construct the random graph by adding not only edges but also
other small graphs. In other words, we first construct an inhomogeneous
random hypergraph with (conditionally) independent hyperedges, and then
replace each hyperedge by a (perhaps complete) graph. Although flexible
enough to produce graphs with significant dependence between edges,
this model is nonetheless mathematically tractable. Indeed, we find the
critical point where a giant component emerges in full generality, in terms
of the norm of a certain integral operator, and relate the size of the giant
component to the survival probability of a certain (non-Poisson)
multi-type branching process. While our main focus is the phase
transition, we also study the degree distribution and the numbers of small
subgraphs. We illustrate the model with a simple special case that
produces graphs with power-law degree sequences with a wide range of
degree exponents and clustering coefficients.
1 Introduction and results



In [10], a very general model for sparse random graphs was introduced,
corresponding to an inhomogeneous version of $G(n, c/n)$, and many
properties of this model were determined, in particular, the critical point
of the phase transition where the giant component emerges. Part of the
motivation was to unify many of the new random graph models introduced as
approximations to real-world networks. Indeed, the model of [10] includes
many of these models as exact special cases, as well as the 'mean-field'
simplified versions of many of the more complicated models. (The original
forms are frequently too complex for rigorous mathematical analysis, so
such mean-field versions are often studied instead.) Unfortunately, there
are many models with key features that are not captured by their
mean-field versions, and hence not by the model of [10]. The main problem
is that many real-world networks exhibit clustering: for example, while
there are $n$ vertices and only $5n$ edges, there may be $10n$ triangles,
say. In contrast, the model of [10], like $G(n, c/n)$, produces graphs that
contain essentially no triangles or short cycles.

Most models introduced to approximate particular real-world networks turn
out to be mathematically intractable, due to the dependence between edges.
Nevertheless, many such models have been studied; as this is not our main
focus, let us just list a few examples of early work in this field. One of
the starting points in this area was the (homogeneous) 'small-world' model
of Watts and Strogatz [37]. Another was the observation of power-law degree
sequences in various networks by Faloutsos, Faloutsos and Faloutsos [27],
among others. Of the new inhomogeneous models, perhaps the most studied is
the 'growth with preferential attachment' model introduced in an imprecise
form by Barabási and Albert [5], later made precise as the 'LCD model' by
Bollobás and Riordan [TS]. Another is the 'copying' model of Kumar,
Raghavan, Rajagopalan, Sivakumar, Tomkins and Upfal [33], generalized by
Cooper and Frieze [53], among others. For (early) surveys of work in this
field see, for example, Barabási and Albert [T], Dorogovtsev and Mendes
[5S], or Bollobás and Riordan [13].

Roughly speaking, any sparse model with clustering must include significant
dependence between edges, so one might expect it to be impossible to
construct a general model of this type that is still mathematically
tractable. However, it turns out that one can do this. The model that we
shall define is essentially a generalization of that in [10], although we
shall handle certain technicalities in a different way here.

Throughout this paper we use standard graph theoretic notation as in [8].
For example, if $G$ is a graph then $V(G)$ denotes its vertex set, $E(G)$
its edge set, $|G|$ the number of vertices, and $e(G)$ the number of edges.
We also use standard notation for probabilistic asymptotics as in [31]: a
sequence $\mathcal{E}_n$ of events holds with high probability, or whp, if
$\mathbb{P}(\mathcal{E}_n) \to 1$ as $n \to \infty$. If $(X_n)$ is a
sequence of random variables and $f$ is a deterministic function, then
$X_n = o_p(f(n))$ means $X_n/f(n) \overset{p}{\to} 0$, where
$\overset{p}{\to}$ denotes convergence in probability.

1.1 The model 

Let us set the scene for our model. By a type space we simply mean a
probability space $(S, \mu)$. Often, we shall take $S = [0, 1]$ or $(0, 1]$
with $\mu$ Lebesgue measure. Sometimes we consider $S$ finite. As will
become clear, any model with $S$ finite can be realized as a model with
type space $[0, 1]$, but sometimes the notation will be simpler with $S$
finite. More generally, as shown in [29], every instance of the random
graph model we are going to describe can be realized as an equivalent model
with type space $[0, 1]$. Hence, when it comes to proofs, we lose no
generality by taking $S = [0, 1]$, but we usually prefer allowing an
arbitrary type space, which is more flexible for applications. For example,
as with the model in [10], type spaces such as $S = [0, 1]^2$ are likely to
be useful for geometric applications, as in [TTj.

Let $\mathcal{F}$ consist of one representative of each isomorphism class
of finite connected graphs, chosen so that if $F \in \mathcal{F}$ has $r$
vertices then $V(F) = [r] = \{1, 2, \dots, r\}$. Given $F \in \mathcal{F}$
with $r$ vertices, let $\kappa_F$ be a measurable function from $S^r$ to
$[0, \infty)$; we call $\kappa_F$ the kernel corresponding to $F$. A
sequence $\kappa = (\kappa_F)_{F \in \mathcal{F}}$ is a kernel family. In
our results we shall impose an additional integrability condition on
$\kappa$, but this is not needed to define the model.

Let $\kappa$ be a kernel family and $n$ an integer; we shall define a random
graph $G(n, \kappa)$ with vertex set $[n] = \{1, 2, \dots, n\}$. First let
$x_1, x_2, \dots, x_n \in S$ be i.i.d. (independent and identically
distributed) with the distribution $\mu$. Given
$\mathbf{x} = (x_1, \dots, x_n)$, construct $G(n, \kappa)$ as follows,
starting with the empty graph. For each $r$ and each $F \in \mathcal{F}$
with $|F| = r$, and for every $r$-tuple of distinct vertices
$(v_1, \dots, v_r) \in [n]^r$, add a copy of $F$ on the vertices
$v_1, \dots, v_r$ (with vertex $i$ of $F$ mapped to $v_i$) with probability

$$p = p(v_1, \dots, v_r; F) = \frac{\kappa_F(x_{v_1}, \dots, x_{v_r})}{n^{r-1}}, \tag{1}$$

all these choices being independent. If $p > 1$, then we simply add a copy
with probability 1. We shall often call the added copies of the various $F$
that together form $G(n, \kappa)$ atoms since, in our construction of
$G(n, \kappa)$, they may be viewed as indivisible building blocks.
Sometimes we refer to them as small graphs, although there is in general no
bound on their sizes. Usually we think of $G(n, \kappa)$ as a simple graph,
in which case we simply replace any multiple edges by single edges.
Typically there will be very few multiple edges, so this makes little
difference.
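
As a rough illustration of the construction above, the following Python sketch samples the simple-graph form of the model for small $n$. It is only a sketch under stated assumptions: the type space is taken to be $[0, 1]$ with Lebesgue measure, vertices are labelled $0, \dots, n-1$ rather than $1, \dots, n$, and the names `sample_graph`, `EDGE` and `TRIANGLE`, as well as the constant kernels in the usage lines, are hypothetical choices made for this example only.

```python
import itertools
import random


def sample_graph(n, kernels, rng=None):
    """Sample G(n, kappa) in its Bernoulli, simple-graph form (illustrative).

    `kernels` maps an atom, given as a tuple of edges on vertices 1..r,
    to a pair (r, kernel function of r types). Types are uniform on [0, 1].
    Returns (types, edges), where edges is a set of frozensets.
    """
    rng = rng or random.Random()
    types = [rng.random() for _ in range(n)]
    edges = set()
    for atom_edges, (r, kernel) in kernels.items():
        # One independent trial for every ordered r-tuple of distinct vertices.
        for tup in itertools.permutations(range(n), r):
            p = kernel(*(types[v] for v in tup)) / n ** (r - 1)
            if rng.random() < min(p, 1.0):
                # Vertex i of the atom is mapped to tup[i - 1];
                # storing edges in a set merges multiple edges.
                for i, j in atom_edges:
                    edges.add(frozenset((tup[i - 1], tup[j - 1])))
    return types, edges


EDGE = ((1, 2),)
TRIANGLE = ((1, 2), (1, 3), (2, 3))

if __name__ == "__main__":
    kernels = {
        EDGE: (2, lambda x, y: 0.5),         # assumed constant edge kernel
        TRIANGLE: (3, lambda x, y, z: 0.2),  # assumed constant triangle kernel
    }
    _, edges = sample_graph(40, kernels, random.Random(1))
    print(len(edges), "edges on 40 vertices")
```

The loop over all ordered $r$-tuples costs about $n^r$ steps per atom, so this direct approach is only practical for small $n$ and small atoms.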

Note that we assume that the atoms of $G(n, \kappa)$ are connected. The
extension to the case where some atoms may be disconnected is discussed in
Section 5.

The reason for dividing by $n^{r-1}$ in (1) is that we wish to consider
sparse graphs; indeed, our main interest is the case when $G(n, \kappa)$
has $O(n)$ edges. As it turns out, we can be slightly more general;
however, when $\kappa_F$ is integrable (which we shall always assume), the
expected number of added copies of each graph $F$ is $O(n)$. Note that all
incompletely specified integrals are with respect to the appropriate
$r$-fold product measure $\mu^r$ on $S^r$.

Remark 1.1. There are several plausible choices for the normalization in
(1). The one we have chosen means that if $\kappa_F = c$ is constant, then
(asymptotically) there are on average $cn$ copies of $F$ in total, and each
vertex is on average in $rc$ copies of $F$. An alternative is to divide the
expression in (1) by $r$; then (asymptotically) each vertex would on
average be in $c$ copies of $F$. Another alternative, natural when adding
cliques only but less so in the general case, would be to divide by $r!$;
this is equivalent to considering unordered sets of $r$ vertices instead of
ordered $r$-tuples. When there is only one kernel, corresponding to adding
edges, this would correspond to the normalization used in [10], and in
particular to that of the classical model $G(n, c/n)$; the normalization we
use here differs from this by a factor of 2. Yet another normalization
would be to divide by $\mathrm{aut}(F)$, the number of automorphisms of
$F$; this is equivalent to considering the distinct copies of $F$ in $K_n$,
which is natural but leads to extra factors $\mathrm{aut}(F)$ in many
formulae, and we do not find that the advantages outweigh the
disadvantages.

As in [10], there are several minor variants of $G(n, \kappa)$; perhaps the
most important is the Poisson multi-graph version of $G(n, \kappa)$. In
this variant, for each $F$ and each $r$-tuple, we add a Poisson
$\mathrm{Po}(p)$ number of copies of $F$ with this vertex set, where $p$ is
given by (1), and we keep multiple edges.

Alternatively, we could add a Poisson number of copies and delete multiple
edges, which is the same as adding one copy with probability $1 - e^{-p}$
and no copy otherwise. More generally, we could add one copy of $F$ with
probability $p + o(p)$, and two or more copies with probability $o(p)$. As
long as the error terms are uniform over graphs $F$ and $r$-tuples
$(v_1, \dots, v_r)$, all our results will apply in this greater generality.
Since this will follow by simple sandwiching arguments (after reducing to
the 'bounded' case; see Definition 2.9), we shall consider whichever form
of the model is most convenient; usually this turns out to be the Poisson
multi-graph form.

Remark 1.2. Under certain mild conditions, the results of [30] imply a
strong form of asymptotic equivalence between the various versions of the
model. For example, if we add copies of $F$ with probability $p + O(p^2)$,
where the implied constant is uniform over $F$ and $(v_1, \dots, v_r)$, and

$$\sum_{F \in \mathcal{F}} \ \sum_{v_1, \dots, v_{|F|}} p(v_1, \dots, v_{|F|}; F)^3 = o(1), \tag{2}$$

then the resulting model is equivalent to that with probability $p$, in
that the two random graphs can be coupled to agree whp; this is a
straightforward modification of [30, Corollary 2.13(i)]. Extending the
argument in [30, Example 3.2], it can be shown that (2) holds if

$$\ldots \ \kappa_F^{\ldots} < \infty.$$

This certainly holds for the bounded kernel families (see Definition 2.9)
that we consider in most of our proofs, although (2) is easy to verify
directly for such kernel families.

In the special case where all $\kappa_F$ are zero apart from
$\kappa_{K_2}$, the kernel corresponding to an edge, we recover
(essentially) a special case of the model of [10]: we call this the
edge-only case, since we add only edges, not larger graphs. We write
$\kappa_2$ for $\kappa_{K_2}$. Note that in the edge-only case, given
$\mathbf{x}$, two vertices $i$ and $j$ are joined with probability

$$\frac{\kappa_2(x_i, x_j) + \kappa_2(x_j, x_i)}{n} + O\!\left(\frac{(\kappa_2(x_i, x_j) + \kappa_2(x_j, x_i))^2}{n^2}\right). \tag{3}$$

The correction term will never matter, so we may as well replace $\kappa_2$
by its symmetrized version. In fact, we shall always assume that $\kappa_F$
is invariant under the action of the automorphism group $\mathrm{Aut}(F)$
of the graph $F$. In other words, if $\phi : [r] \to [r]$ is a permutation
such that $\phi(i)\phi(j) \in E(F)$ if and only if $ij \in E(F)$, then we
assume that
$\kappa_F(x_{\phi(1)}, \dots, x_{\phi(r)}) = \kappa_F(x_1, \dots, x_r)$ for
all $x_1, \dots, x_r \in S$. In the Poisson version, or if we add copies of
graphs $F$ with probability $1 - e^{-p}$, the correction terms in (3) and
its generalizations disappear: in the edge-only case, given $\mathbf{x}$,
vertices $i$ and $j$ are joined with probability
$1 - \exp\bigl(-(\kappa_2(x_i, x_j) + \kappa_2(x_j, x_i))/n\bigr)$, and in
general we obtain exactly the same random graph if we symmetrize each
$\kappa_F$ with respect to $\mathrm{Aut}(F)$.
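
To make the size of the correction term concrete, here is a small Python sketch comparing the two edge-only join probabilities discussed above: the Bernoulli form, with one independent trial for each of the ordered pairs $(i, j)$ and $(j, i)$, and the Poisson form. The function names and the sample kernel values are assumptions made for illustration.

```python
import math


def edge_prob_bernoulli(k_ij, k_ji, n):
    """P(i ~ j) with independent trials for the ordered pairs (i, j) and (j, i)."""
    p1, p2 = min(k_ij / n, 1.0), min(k_ji / n, 1.0)
    return p1 + p2 - p1 * p2


def edge_prob_poisson(k_ij, k_ji, n):
    """P(i ~ j) when a Poisson number of copies is added and merged."""
    return 1.0 - math.exp(-(k_ij + k_ji) / n)


if __name__ == "__main__":
    # Hypothetical kernel values kappa_2(x_i, x_j) = 1.5, kappa_2(x_j, x_i) = 0.5.
    for n in (10, 100, 1000):
        print(n, edge_prob_bernoulli(1.5, 0.5, n), edge_prob_poisson(1.5, 0.5, n))
```

Both quantities are bounded above by the sum of the two kernel values divided by $n$, and their difference is of order $1/n^2$, which is why the choice of variant makes no difference to the asymptotic results.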

For any kernel family $\kappa$, let $\kappa_e$ be the corresponding edge
kernel, defined by

$$\kappa_e(x, y) = \sum_{F \in \mathcal{F}} \ \sum_{ij \in E(F)} \int_{S^{|F|-2}} \kappa_F(x_1, \dots, x_{i-1}, x, x_{i+1}, \dots, x_{j-1}, y, x_{j+1}, \dots, x_{|F|}), \tag{4}$$

where the second sum runs over all $2e(F)$ ordered pairs $(i, j)$ with
$ij \in E(F)$, and we integrate over all variables apart from $x$ and $y$.
Note that the sum need not always converge; since every term is positive
this causes no problems: we simply allow $\kappa_e(x, y) = \infty$ for some
$x, y$. Given $x_i$ and $x_j$, the probability that $i$ and $j$ are joined
in $G(n, \kappa)$ is at most $\kappa_e(x_i, x_j)/n$, and this upper bound
is typically quite sharp. For example, if $\kappa$ is bounded in the sense
of Definition 2.9 below, then the probability is
$\kappa_e(x_i, x_j)/n + O(1/n^2)$. In other words, $\kappa_e$ captures the
edge probabilities in $G(n, \kappa)$, but not the correlations.

Before proceeding to deeper properties, let us note that the expected
number of added copies of $F$ is
$(1 + O(n^{-1})) \, n \int_{S^{|F|}} \kappa_F$. Unsurprisingly, the actual
number turns out to be concentrated about this mean. Let

$$\xi(\kappa) = \sum_{F \in \mathcal{F}} e(F) \int_{S^{|F|}} \kappa_F = \frac{1}{2} \int_{S^2} \kappa_e \le \infty$$

be the asymptotic edge density of $\kappa$. Since every copy of $F$
contributes $e(F)$ edges, the following theorem is almost obvious, provided
we can ignore overlapping edges. A formal proof will be given in Section 7.
(A similar result for the total number of atoms is given in Lemma 9.4.)



Theorem 1.3. As $n \to \infty$, $e(G(n, \kappa))/n$ converges in
probability to the asymptotic edge density $\xi(\kappa)$. In other words,
if $\xi(\kappa) < \infty$ then $e(G(n, \kappa)) = \xi(\kappa) n + o_p(n)$,
and if $\xi(\kappa) = \infty$ then, for every constant $C$, we have
$e(G(n, \kappa)) > Cn$ whp. Moreover,
$\mathbb{E}\, e(G(n, \kappa))/n \to \xi(\kappa) \le \infty$.

As in [10], our main focus will be the emergence of the giant component.
By the component structure of a graph $G$, we mean the set of vertex sets
of its components, i.e., the structure encoding only which vertices are in
the same component, not the internal structure of the components
themselves. When studying the component structure of $G(n, \kappa)$, the
model can be simplified somewhat. Recalling that the atoms
$F \in \mathcal{F}$ are connected by definition, when we add an atom $F$ to
a graph $G$, the effect on the component structure is simply to unite all
components of $G$ that meet the vertex set of $F$, so only the vertex set
of $F$ matters, not its graph structure. We say that $\kappa$ is a clique
kernel family if the only non-zero kernels are those corresponding to
complete graphs; the corresponding random graph model $G(n, \kappa)$ is a
clique model. For questions concerning component structure, it suffices to
study clique models. For clique kernels we write $\kappa_r$ for
$\kappa_{K_r}$; as above, we always assume that $\kappa_r$ is symmetric,
here meaning invariant under all permutations of the coordinates of $S^r$.
Given a general kernel family $\kappa' = (\kappa'_F)_{F \in \mathcal{F}}$,
the corresponding (symmetrized) clique kernel family is given by
$\kappa = (\kappa_r)_{r \ge 2}$ with

$$\kappa_r(x_1, \dots, x_r) = \sum_{F \in \mathcal{F} :\, |F| = r} \frac{1}{r!} \sum_{\pi \in \mathfrak{S}_r} \kappa'_F(x_{\pi(1)}, \dots, x_{\pi(r)}), \tag{5}$$

where $\mathfrak{S}_r$ denotes the symmetric group of permutations of
$[r]$. (This is consistent with our notation $\kappa_2 = \kappa_{K_2}$.) In
the Poisson version, with or without merging of parallel edges, the
probability of adding some connected graph $F$ on a given set of $r$
vertices is exactly the same in $G(n, \kappa')$ and $G(n, \kappa)$, so
there is a natural coupling of these random graphs in which they have
exactly the same component structure. In the non-Poisson version, the
probabilities are not quite the same, but close enough for our results to
transfer from one to the other. Thus, when considering the size (meaning
number of vertices) of the giant component in $G(n, \kappa')$, we may
always replace $\kappa'$ by the corresponding clique kernel family.

It is often convenient to think of a clique model as a random hypergraph,
with the cliques as the hyperedges; for this reason we call a clique kernel
family a hyperkernel. Note that each unordered set of $r$ vertices
corresponds to $r!$ $r$-tuples, so the probability that we add a $K_r$ on a
given set of $r$ vertices is
$r! \, \kappa_r(x_{v_1}, \dots, x_{v_r})/n^{r-1}$. (More precisely, this is
the expected number of $K_r$s added with this vertex set.)

1.2 A branching process 

Associated to each hyperkernel $\kappa = (\kappa_r)_{r \ge 2}$, there is a
branching process $\mathfrak{X}_\kappa$ with type space $S$, defined as
follows. We start with generation 0 consisting of a single particle whose
type is chosen randomly from $S$ according to the distribution $\mu$. A
particle $P$ of type $x$ gives rise to children in the next generation
according to a two-step process: first, for each $r \ge 2$, construct a
Poisson process $Z_r$ on $S^{r-1}$ with intensity

$$r \kappa_r(x, x_2, \dots, x_r) \, d\mu(x_2) \cdots d\mu(x_r). \tag{6}$$

We call the points of $Z = \bigcup_{r \ge 2} Z_r$ the child cliques of $P$.
There are $r - 1$ children of $P$ for each child clique
$(x_2, \dots, x_r) \in S^{r-1}$, one each of types $x_2, \dots, x_r$. Thus
the types of the children of $P$ form a multiset on $S$, with a certain
compound Poisson distribution we have just described. As usual, the
children of different particles are independent of each other, and of the
history.
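
For intuition, the following Python sketch simulates this branching process in the simplest possible setting, where each clique kernel is constant (so types play no role). This is an illustrative assumption, not part of the general definition: with a constant kernel of value `c[r]` for cliques of size $r$, the number of child cliques of size $r$ is Poisson with mean `r * c[r]`, and each contributes $r - 1$ children. The helper names and the truncation parameters are hypothetical.

```python
import math
import random


def poisson(lam, rng):
    """Sample Poisson(lam) by Knuth's multiplication method (small lam only)."""
    threshold, k, prod = math.exp(-lam), 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def num_children(c, rng):
    """Number of children of one particle when kappa_r = c[r] is constant.

    Child cliques of size r arrive as Poisson(r * c[r]); each one
    contributes r - 1 children.
    """
    return sum((r - 1) * poisson(r * c_r, rng) for r, c_r in c.items())


def survives(c, rng, max_generations=50, cap=1000):
    """Crude proxy for survival: still alive late on, or population above cap."""
    alive = 1
    for _ in range(max_generations):
        if alive == 0:
            return False
        if alive > cap:
            return True
        alive = sum(num_children(c, rng) for _ in range(alive))
    return alive > 0


if __name__ == "__main__":
    rng = random.Random(0)
    c = {2: 0.3, 3: 0.2}  # assumed constants for edges and triangles
    runs = 2000
    print(sum(survives(c, rng) for _ in range(runs)) / runs)
```

The truncation by `max_generations` and `cap` only approximates the event of survival, so the printed frequency is a Monte Carlo estimate rather than an exact value.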



Considering the relationship to the graph $G(n, \kappa)$, the initial
factor $r$ in (6) arises because a particular vertex $v$ may be any one of
the $r$ vertices in an $r$-tuple $(v_1, \dots, v_r)$ on which we add a
$K_r$.

We also consider the branching processes $\mathfrak{X}_\kappa(x)$,
$x \in S$, defined exactly as $\mathfrak{X}_\kappa$, except that we start
with a single particle of the given type $x$.

1.3 Two integral operators 

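We shall consider two integral operators naturally associated to
$\mathfrak{X}_\kappa$. Given any (measurable) $f : S \to [0, 1]$, define
$S_\kappa(f)$ by

$$S_\kappa(f)(x) = \sum_{r=2}^{\infty} \int_{S^{r-1}} r \kappa_r(x, x_2, x_3, \dots, x_r) \Bigl( 1 - \prod_{i=2}^{r} (1 - f(x_i)) \Bigr) \, d\mu(x_2) \cdots d\mu(x_r), \tag{7}$$

and let

$$\Phi_\kappa(f)(x) = 1 - \exp\bigl(-S_\kappa(f)(x)\bigr).$$

(The factors $r$ in (7) and in the definition of $\mathfrak{X}_\kappa$ are
unfortunate consequences of our choice of normalization.)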

Let $P$ be a particle of $\mathfrak{X}_\kappa$ in generation $t$ with type
$x$, and suppose that each particle in generation $t + 1$ of type $y$ has
some property $Q$ with probability $f(y)$, independently of the other
particles. Given a child clique $(x_2, \dots, x_r)$ of $P$, the bracket in
the definition of $S_\kappa$ expresses the probability that one or more of
the $r - 1$ corresponding child particles has property $Q$. Hence
$S_\kappa(f)(x)$ is the expected number of child cliques containing a
particle with property $Q$, and, from the Poisson distribution of the child
cliques, $\Phi_\kappa(f)(x)$ is the probability that there is at least one
such clique, i.e., the probability that at least one child of $P$ has
property $Q$.

Let $\rho(\kappa)$ denote the survival probability of the branching process
$\mathfrak{X}_\kappa$, and $\rho_\kappa(x)$ the survival probability of
$\mathfrak{X}_\kappa(x)$. Assuming for the moment that the function
$\rho_\kappa : S \to [0, 1]$ is measurable, from the comments above and the
independence built into the definition of $\mathfrak{X}_\kappa$, we see
that the function $\rho_\kappa$ satisfies

$$\rho_\kappa = \Phi_\kappa(\rho_\kappa).$$

Using simple standard arguments as in [10], for example, it is easy to
check that $\rho_\kappa$ is given by the maximum solution to this equation,
i.e., the pointwise supremum of all solutions $f : S \to [0, 1]$ to

$$f = 1 - e^{-S_\kappa(f)}; \tag{8}$$

see Lemma 2.1 below. From the definitions of $\mathfrak{X}_\kappa$ and
$\mathfrak{X}_\kappa(x)$, it is immediate that

$$\rho(\kappa) = \int_S \rho_\kappa(x) \, d\mu(x).$$
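
As a worked illustration of equation (8), the Python sketch below solves it by iteration in the special case where every clique kernel is constant. This is an assumption made to keep the example one-dimensional: by symmetry the survival probability is then the same for every type, and the operator in (7) reduces to a function of a single number $f$. Starting from $f = 1$ and iterating mirrors the monotone approach to the maximum solution; the function names are hypothetical.

```python
import math


def s_const(c, f):
    """S_kappa(f) for constant kernels kappa_r = c[r] and a constant function f."""
    return sum(r * c_r * (1.0 - (1.0 - f) ** (r - 1)) for r, c_r in c.items())


def survival_probability(c, tol=1e-12, max_iter=100000):
    """Iterate f -> 1 - exp(-S_kappa(f)) downwards from f = 1."""
    f = 1.0
    for _ in range(max_iter):
        new_f = 1.0 - math.exp(-s_const(c, f))
        if abs(new_f - f) < tol:
            return new_f
        f = new_f
    return f


if __name__ == "__main__":
    # Edge-only case with kappa_2 = 1: the fixed-point equation is
    # f = 1 - exp(-2 f), as for the classical model G(n, 2/n).
    print(survival_probability({2: 1.0}))
    # Adding triangles with an assumed constant kappa_3 = 0.1.
    print(survival_probability({2: 1.0, 3: 0.1}))
```

Near the critical point the iteration converges slowly, which is why the sketch caps the number of steps instead of relying on the tolerance alone.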



In our analysis we shall also consider the linear operator $T_{\kappa_e}$
defined by

$$T_{\kappa_e}(f)(x) = \int_S \kappa_e(x, y) f(y) \, d\mu(y), \tag{9}$$

where $\kappa_e$ is defined by (4). For a hyperkernel $\kappa$ (which is
the only type of kernel family for which we define the branching process),
we have

$$\kappa_e(x, y) = \sum_{r \ge 2} r(r - 1) \int_{S^{r-2}} \kappa_r(x, y, x_3, x_4, \dots, x_r) \, d\mu(x_3) \cdots d\mu(x_r), \tag{10}$$

from which it is easy to check that $T_{\kappa_e}$ is the linearized form
of $S_\kappa$: more precisely, $T_{\kappa_e}$ is obtained by replacing
$1 - \prod_{i=2}^{r} (1 - f(x_i))$ by $\sum_{i=2}^{r} f(x_i)$ in the
definition (7) of $S_\kappa$.

Let us note two simple consequences of this fact. For any sequence
$(y_i)_i$ in $[0, 1]$ we have $1 - \prod_i (1 - y_i) \le \sum_i y_i$, so

$$0 \le S_\kappa(f) \le T_{\kappa_e}(f) \tag{11}$$

for any $f : S \to [0, 1]$. Also, $1 - \prod_i (1 - y_i) > 0$ if and only
if $\sum_i y_i > 0$. Since the integral of a non-negative function is
positive if and only if the function is positive on a set of positive
measure, it follows that for any $f : S \to [0, 1]$ we have

$$S_\kappa(f)(x) > 0 \iff T_{\kappa_e}(f)(x) > 0. \tag{12}$$

In the edge-only case, when only $\kappa_2$ is non-zero,
$\kappa_e = 2\kappa_2$ and $T_{\kappa_e} = S_\kappa$. When translating
results from [10], it is sometimes $T_{\kappa_e}$ and sometimes $S_\kappa$
that plays the role of the linear operator $T_\kappa$ appearing there.

1.4 Main results 

In most of our results we shall need to impose some sort of integrability condition 
on our kernel family; the exact condition depends on the context. 

Definition 1.4. (i) A kernel family
$\kappa = (\kappa_F)_{F \in \mathcal{F}}$ is integrable if

$$\int \kappa = \sum_{F \in \mathcal{F}} |F| \int_{S^{|F|}} \kappa_F < \infty. \tag{13}$$

This means that the expected number of atoms containing a given vertex is
bounded.

(ii) A kernel family $\kappa = (\kappa_F)_{F \in \mathcal{F}}$ is edge
integrable if

$$\sum_{F \in \mathcal{F}} e(F) \int_{S^{|F|}} \kappa_F < \infty;$$

equivalently, $\xi(\kappa) < \infty$ or $\int_{S^2} \kappa_e < \infty$.
This means that the expected number of edges in $G(n, \kappa)$ is $O(n)$,
see Theorem 1.3, and thus the expected degree of a given vertex is bounded.

Note that a hyperkernel $(\kappa_r)$ is integrable if and only if
$\sum_{r \ge 2} r \int_{S^r} \kappa_r < \infty$, and edge integrable if and
only if $\sum_{r \ge 2} r^2 \int_{S^r} \kappa_r < \infty$.

Since we only consider connected atoms $F$, it is clear that

edge integrable $\Rightarrow$ integrable.

Our main result is that if $\kappa$ is an integrable kernel family
satisfying a certain extra assumption, then the normalized size of the
giant component in $G(n, \kappa)$ is simply $\rho(\kappa) + o_p(1)$. The
extra assumption is essentially that the graph does not split into two
pieces. As in [10], we say that a symmetric kernel
$\kappa_e : S^2 \to [0, \infty)$ is reducible if

$\exists A \subset S$ with $0 < \mu(A) < 1$ such that $\kappa_e = 0$ a.e. on $A \times (S \setminus A)$;

otherwise $\kappa_e$ is irreducible. Thus $\kappa_e$ is irreducible if

$A \subseteq S$ and $\kappa_e = 0$ a.e. on $A \times (S \setminus A)$ implies $\mu(A) = 0$ or $\mu(S \setminus A) = 0$.

A kernel family $(\kappa_F)_{F \in \mathcal{F}}$ or hyperkernel
$(\kappa_r)_{r \ge 2}$ is irreducible if the corresponding edge kernel
$\kappa_e$ is irreducible. It is easy to check that a kernel family
$(\kappa_F)_{F \in \mathcal{F}}$ is irreducible if and only if for every
$A \subset S$ with $0 < \mu(A) < 1$ there exists an $F \in \mathcal{F}$
such that, with $r = |F|$, if $x_1, \dots, x_r$ are chosen independently at
random in $S$ with distribution $\mu$, then there is a positive probability
that $\{x_i\} \cap A \ne \emptyset$,
$\{x_i\} \cap (S \setminus A) \ne \emptyset$ and
$\kappa_F(x_1, \dots, x_r) > 0$. Informally,
$(\kappa_F)_{F \in \mathcal{F}}$ is irreducible if, whenever we partition
the type space into two non-trivial parts, edges between vertices with
types in the two parts are possible.

Note that a kernel family $\kappa'$ and the corresponding hyperkernel
$\kappa$ do not have the same edge kernel: replacing each atom by a clique
in general adds edges, so $\kappa'_e \le \kappa_e$ with strict inequality
possible. If $\kappa'_e$ is irreducible, then so is $\kappa_e$; using the
characterization of irreducibility above, it is easy to check that the
reverse implication also holds.

We are now ready to state our main result; we write $C_i(G)$ for the number
of vertices in the $i$th largest component of a graph $G$.

Theorem 1.5. Let $\kappa' = (\kappa'_F)_{F \in \mathcal{F}}$ be an
irreducible, integrable kernel family, and let
$\kappa = (\kappa_r)_{r \ge 2}$ be the corresponding hyperkernel, given by
(5). Then

$$C_1(G(n, \kappa')) = \rho(\kappa) n + o_p(n),$$

and $C_2(G(n, \kappa')) = o_p(n)$.

The reducible case reduces to the irreducible one; see Remark 4.5.



Remark 1.6. Unsurprisingly, part of the proof of Theorem 1.5 involves
showing that (in the hyperkernel case) the branching process captures the
'local structure' of $G(n, \kappa)$; see Section 3 and in particular
Lemma 3.2. So Theorem 1.5 can be seen as saying that within this broad
class of models the local structure determines the size of the giant
component. Of course, the restriction is important, as shown by the fact
that the global assumption of irreducibility is necessary.



Of course, for Theorem 1.5 to be useful we would like to know something
about the survival probability $\rho(\kappa)$; as noted earlier,
$\rho(\kappa)$ can be calculated from $\rho_\kappa$, which is in turn the
largest solution to a certain functional equation (8). Of course, the main
thing we would like to know is when $\rho(\kappa)$ is positive; as in [10],
it turns out that the answer depends on the $L^2$-norm
$\|T_{\kappa_e}\| \le \infty$ of the operator $T_{\kappa_e}$ defined by
(9). (Since this operator is symmetric, its $L^2$-norm is the same as its
spectral radius. In other contexts, it may be better to work with the
latter.)

Theorem 1.7. Let $\kappa$ be an integrable hyperkernel. Then
$\rho(\kappa) > 0$ if and only if $\|T_{\kappa_e}\| > 1$. Furthermore, if
$\kappa$ is irreducible and $\|T_{\kappa_e}\| > 1$, then $\rho_\kappa(x)$
is the unique non-zero solution to the functional equation (8), and
$\rho_\kappa(x) > 0$ holds for a.e. $x$.

In general, $\|T_{\kappa_e}\|$ may be rather hard to calculate; a
non-trivial example where we can calculate the norm easily is given in
Subsection 8.2. Let us give a trivial example here: suppose that each
$\kappa_r$ is constant, say $\kappa_r = c_r$. Then
$\kappa_e(x, y) = \sum_r r(r - 1) c_r = 2\xi(\kappa)$ for all $x$ and $y$,
so

$$\|T_{\kappa_e}\| = 2\xi(\kappa). \tag{14}$$

This is perhaps surprising: it tells us that for such uniform hyperkernels,
the critical point where a giant component emerges is determined only by
the total number of edges added; it does not matter what size cliques they
lie in, even though, for example, the third edge in every triangle is
'wasted'. This is not true for arbitrary kernel families: we must first
replace each atom by a clique. Note that for any hyperkernel,

$$\|T_{\kappa_e}\| \ge \langle 1, T_{\kappa_e} 1 \rangle = \int_{S^2} \kappa_e = 2\xi(\kappa),$$

with equality if and only if 1 is an eigenfunction, i.e., if the asymptotic
expected degrees $\lambda(x) = \int_S \kappa_e(x, y) \, d\mu(y)$ are the
same (ignoring sets of measure 0); cf. [10, Proposition 3.4].
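
The trivial example can be checked numerically. The Python sketch below, with hypothetical function names, takes constant clique kernels given as a dictionary mapping the clique size $r$ to the constant $c_r$, and computes the edge density and the operator norm from the formulas above; for a constant edge kernel on a probability space the norm of the integral operator is simply that constant.

```python
def edge_density(c):
    """xi(kappa) = sum_r e(K_r) c_r for constant clique kernels kappa_r = c[r]."""
    return sum(r * (r - 1) / 2 * c_r for r, c_r in c.items())


def operator_norm(c):
    """Norm of T_{kappa_e} when kappa_e is the constant sum_r r (r - 1) c_r."""
    return sum(r * (r - 1) * c_r for r, c_r in c.items())


def is_supercritical(c):
    """A giant component emerges exactly when the operator norm exceeds 1."""
    return operator_norm(c) > 1.0


if __name__ == "__main__":
    # Two assumed models with the same edge density 0.6: edges only,
    # and triangles only. Both have norm 1.2 and are supercritical.
    for c in ({2: 0.6}, {3: 0.2}):
        print(c, edge_density(c), operator_norm(c), is_supercritical(c))
```

In this constant setting the triangles-only model becomes supercritical once $c_3 > 1/6$, and the edges-only model once $c_2 > 1/2$, which in both cases is the point where the edge density passes $1/2$.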

1.5 Relationship to the results in [10]

In the edge-only case, the present results are almost (see below) special
cases of those of [10]. The set-up here is much simpler, as we choose to
insist that the vertex types $x_1, \dots, x_n$ are i.i.d. This avoids many
of the complications arising in [10]. In one way, the present set-up is,
even in the edge-only case, more general than that considered in [10]: with
the types i.i.d., there is no need to restrict the kernels other than to
assume integrability (in [10] we needed them continuous a.e.), and one does
not need to impose the 'graphicality' assumption of [10]. Thus the
edge-only case here actually complements the results in [10]. We could form
a common generalization, but we shall not do this in detail; we believe
that it is just a question of combining the various technicalities here and
in [10], and that no interesting new difficulties arise. Of course, these
technicalities are rather beside the point of the present paper; our
interest is the extension from kernels to hyperkernels. This turns out not
to be as straightforward as one might perhaps expect. The problem is that
the correlation between edges forces us to deal with a non-linear operator,
namely $S_\kappa$.

The rest of the paper is organized as follows. In the next section we prove
the results about the non-Poisson branching process $\mathfrak{X}_\kappa$
that we shall need later, the most important of which is Theorem 1.7. In
Section 3 we consider the local coupling between the graph and the
branching process, showing in particular that the 'right' number of
vertices are in components of any fixed size. In Section 4 we complete the
proof of Theorem 1.5, showing that whp there is at most one 'large'
component, which is then a 'giant' component of the right size. We briefly
discuss percolation on the graphs $G(n, \kappa)$ in Section 5. In
Sections 6 and 7 we consider simpler properties of $G(n, \kappa)$, namely
the asymptotic degree distribution and the number of subgraphs isomorphic
to a given graph. Our results in Section 7 include Theorem 1.3 as a simple
special case. In Section 8 we illustrate the flexibility of the model by
carrying out explicit calculations for a special case, giving graphs with
power-law degree sequences with a range of exponents and a range of
clustering and mixing coefficients; see Section 8 for the definitions of
these coefficients. Finally, in Section 9 we discuss connections between
our model and various notions of graph limit, and state two open questions.

2 Analysis of the branching process 

In this section, which is the heart of the paper, we forget about graphs,
and study the (compound Poisson) branching process $\mathfrak{X}_\kappa$.
One might expect the arguments of [10] to carry over mutatis mutandis to
the present context, but in the branching process analysis this is very far
from the truth; this applies especially to the proof of Theorem 2.4 below.

Throughout the section we work with an integrable hyperkernel
$\kappa = (\kappa_r)_{r \ge 2}$, i.e., we assume that
$\int \kappa = \sum_{r \ge 2} r \int_{S^r} \kappa_r < \infty$. The main aim
in this section is to prove Theorem 1.7.

For $x \in S$ let

$$\lambda(x) = (S_\kappa(1))(x) = \sum_{r \ge 2} \int_{S^{r-1}} r \kappa_r(x, x_2, x_3, \dots, x_r) \, d\mu(x_2) \cdots d\mu(x_r),$$

so $\lambda(x)$ is the expected number of child cliques of a particle of
type $x$. We have

$$\int_S \lambda(x) \, d\mu(x) = \sum_{r \ge 2} r \int_{S^r} \kappa_r = \int \kappa,$$

which is finite by our integrability assumption (13). It follows that
$\lambda(x) < \infty$ holds almost everywhere. Changing each kernel
$\kappa_r$ on a set of measure zero, we may assume that $\lambda(x)$ is
finite for all $x$. (Such a change is irrelevant for the branching process
and for the graph.) From now on, we thus assume that $\lambda(x) < \infty$
holds for all $x$, for any hyperkernel $\kappa$ we consider.

Since a Poisson random variable with finite mean is always finite, any
particle in $\mathfrak{X}_\kappa$ has a finite number of child cliques, and
hence a finite number of children, even though the expected number of
children may perhaps be infinite. Hence, the event that the branching
process dies out (i.e., that some generation is empty) coincides with the
event that it is finite.

Using this fact, we have the following standard result. Recall that
$\rho_\kappa(x)$ denotes the survival probability of the branching process
$\mathfrak{X}_\kappa(x)$ that starts with a single particle of type $x$,
and $\rho_\kappa$ denotes the function $x \mapsto \rho_\kappa(x)$.

Lemma 2.1. The function $\rho_\kappa$ satisfies the functional equation
(8). Furthermore, if $f : S \to [0, 1]$ is any other solution to (8), then
$0 \le f(x) \le \rho_\kappa(x) < 1$ holds for every $x$.

Proof. Let $\rho_t(x)$ be the probability that $\mathfrak{X}_\kappa(x)$
survives for at least $t$ generations, so $\rho_0$ is identically 1.
Conditioned on the set of child cliques, and hence children, of the root,
each child of type $y$ survives for $t$ further generations with
probability $\rho_t(y)$. These events are independent for different
children by the definition of the branching process, so
$\rho_{t+1} = \Phi_\kappa(\rho_t)$. The result follows from the
monotonicity of $\Phi_\kappa$ and the fact that
$\rho_t(x) \searrow \rho_\kappa(x)$, noting that
$\Phi_\kappa(1)(x) = 1 - e^{-\lambda(x)} < 1$ for the strict inequality. □

Let us remark for the last time on the measurability of the functions we
consider: in the proof above, $\rho_0$ is measurable by definition. From
the definition of $\Phi_\kappa$ and the measurability of each $\kappa_r$,
it follows by induction that each $\rho_t$ is measurable, and hence that
$\rho_\kappa$ is. Similar arguments apply in many places later, but we
shall omit them.

We next turn to the uniqueness of the non-zero solution (if any) to (8).
The key ingredient in establishing this is the following simple inequality
concerning the non-linear operator $S_\kappa$.

Lemma 2.2. Let $\kappa$ be an integrable hyperkernel, and let $f$ and $g$
be measurable functions on $S$ with $0 \le f \le g \le 1$. Then

$$\int_S f \, S_\kappa g \le \int_S g \, S_\kappa f.$$

Proof. We may write $S_\kappa$ as $\sum_{r \ge 2} S_r$, where $S_r$ is the
non-linear operator corresponding to the single kernel $\kappa_r$, so
$S_r(f)$ is defined by the summand in (7). It suffices to prove that

$$\int_S f \, S_r g \le \int_S g \, S_r f. \tag{15}$$

We shall in fact show that for any (distinct) $x_1, \dots, x_r \in S$ we
have

$$\sum_{\pi \in \mathfrak{S}_r} f(x_{\pi(1)}) \Bigl( 1 - \prod_{i=2}^{r} (1 - g(x_{\pi(i)})) \Bigr) \le \sum_{\pi \in \mathfrak{S}_r} g(x_{\pi(1)}) \Bigl( 1 - \prod_{i=2}^{r} (1 - f(x_{\pi(i)})) \Bigr). \tag{16}$$

Since $\kappa_r$ is symmetric, (15) follows. (In fact, (15) can be true in
general only if (16) always holds, considering the symmetrization of a
delta function.) Now (16) can be viewed as an inequality in $2r$ variables
$f(x_1), \dots, f(x_r), g(x_1), \dots, g(x_r)$. This inequality is linear
in each variable. Furthermore, it is linear in each pair
$(f(x_i), g(x_i))$. In proving (16) for any $0 \le f \le g \le 1$, we may
thus assume that for each $i$ one of three possibilities holds:
$0 = f(x_i) = g(x_i)$, $f(x_i) = g(x_i) = 1$, or $f(x_i) = 0$ and
$g(x_i) = 1$. In other words, we may assume that $f$ and $g$ are
$\{0, 1\}$-valued.

Suppose then for a contradiction that (16) fails for some $\{0,1\}$-valued $f$
and $g$ with $f \le g$. Then there must be some permutation $\pi$ such that

$$f(x_{\pi(1)}) \Bigl(1 - \prod_{i=2}^{r} \bigl(1 - g(x_{\pi(i)})\bigr)\Bigr)
  > g(x_{\pi(1)}) \Bigl(1 - \prod_{i=2}^{r} \bigl(1 - f(x_{\pi(i)})\bigr)\Bigr), \tag{17}$$

which we may take without loss of generality to be the identity permutation.
Since both sides of (17) are $\{0,1\}$-valued, the left must be 1 and the right
0. Since the left is 1, we have $f(x_1) = 1$, so, using $f \le g$, $g(x_1) = 1$.
But now for the right hand side of (17) to be 0 the final product in (17) must
be 1, so $f(x_i) = 0$ for $i = 2, \ldots, r$, i.e., $f$ takes the value 1 only
once. Of course, $g$ must take the value 1 at least twice, otherwise we have
equality. But now the left hand side of (16) is exactly $(r-1)!$, coming from
terms with $\pi(1) = 1$ and hence $f(x_{\pi(1)}) = 1$. The right hand side is at
least $(r-1)!$, from any $\pi$ mapping 1 to some $j \ne 1$ with $g(x_j) = 1$.
Hence (16) holds after all, giving a contradiction and completing the proof. □
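
The permutation inequality (16) is easy to test by brute force for small $r$. The sketch below is only a sanity check, not a substitute for the proof: it enumerates all $\{0,1\}$-valued pairs $f \le g$ on $r$ points, and also tries random real-valued pairs. The function names are hypothetical, and `math.prod` requires Python 3.8 or later.

```python
import itertools
import math
import random

def both_sides(f, g):
    """Left and right sides of (16) for the value lists f(x_1..x_r), g(x_1..x_r)."""
    r = len(f)
    lhs = rhs = 0.0
    for pi in itertools.permutations(range(r)):
        first, rest = pi[0], pi[1:]
        lhs += f[first] * (1.0 - math.prod(1.0 - g[i] for i in rest))
        rhs += g[first] * (1.0 - math.prod(1.0 - f[i] for i in rest))
    return lhs, rhs

def check_zero_one(r):
    """Exhaustively check (16) over all {0,1}-valued f <= g on r points."""
    for g in itertools.product((0, 1), repeat=r):
        for f in itertools.product((0, 1), repeat=r):
            if all(a <= b for a, b in zip(f, g)):
                lhs, rhs = both_sides(f, g)
                if lhs > rhs:
                    return False
    return True

def check_random(r, trials=200, seed=0):
    """Check (16) on random real-valued f <= g, up to rounding error."""
    rng = random.Random(seed)
    for _ in range(trials):
        g = [rng.random() for _ in range(r)]
        f = [rng.random() * b for b in g]
        lhs, rhs = both_sides(f, g)
        if lhs > rhs + 1e-9:
            return False
    return True
```

For $r = 2$ the two sides coincide, reflecting the fact that the edge-only case is trivial by symmetry.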

If $\kappa$ is reducible, then (8) may in general have several non-zero
solutions. To prove uniqueness in the irreducible case we need to know what
irreducibility tells us about $S_\kappa$.

Lemma 2.3. If there exists a measurable $f : S \to [0,1]$ with
$0 < \mu\{f > 0\} < 1$ and $\{S_\kappa f > 0\} \subseteq \{f > 0\}$, then
$\kappa$ is reducible.

Proof. Let $A = \{f > 0\}$, so by assumption $S_\kappa f = 0$ on
$A^c = S \setminus A$. From (11) we have
$\{T_{\kappa_e} f = 0\} = \{S_\kappa f = 0\}$, so $T_{\kappa_e} f = 0$ on $A^c$.
From the definition of $T_{\kappa_e}$, it follows that $\kappa_e = 0$ a.e. on
$A^c \times A$, so $\kappa_e$ is reducible. But this is what it means for
$\kappa$ to be reducible. □

In fact, taking $f$ to be a suitable indicator function, one can check that the
converse of Lemma 2.3 also holds.

Using Lemmas 2.2 and 2.3 it is easy to deduce uniqueness of any non-zero
solution to (8).

Theorem 2.4. Let $\kappa$ be an irreducible, integrable hyperkernel, and let
$f$ and $g$ be solutions to (8) with $0 \le f(x) \le g(x) \le 1$ for every $x$.
Then either $f = 0$ or $f = g$. In particular, the only solutions to (8) are
$\rho_\kappa$ and the zero function, which may or may not coincide.

Proof. We may suppose that $f$ is not 0 a.e.; otherwise, $f = \Phi_\kappa(f)$
would be identically zero. Since $f$ solves (8), we have
$\{f = 0\} = \{S_\kappa f = 0\}$, so by Lemma 2.3 we cannot have
$0 < \mu\{f > 0\} < 1$. The only possibility left is that $\mu\{f > 0\} = 1$,
i.e., $f > 0$ a.e. Turning to $g$, since $\kappa$ is integrable, we have
$S_\kappa(g)(x) \le S_\kappa(1)(x) = \lambda(x) < \infty$ for a.e. $x$, and thus
$g = \Phi_\kappa(g) < 1$ a.e.

Since $f$ and $g$ solve (8), we have $S_\kappa(f)(x) = -\log(1 - f(x))$ and
$S_\kappa(g)(x) = -\log(1 - g(x))$. Hence,

$$f S_\kappa(g) = -f \log(1 - g) = f\bigl(g + g^2/2 + g^3/3 + \cdots\bigr)
  \ge g\bigl(f + f^2/2 + f^3/3 + \cdots\bigr) = g S_\kappa(f)$$

whenever $0 \le f \le g < 1$, with strict inequality whenever $0 < f < g$. Since
$\kappa$ is integrable, it is immediate from the definition (7) that
$S_\kappa f$ and $S_\kappa g$ are integrable, and it follows that

$$\int_S f S_\kappa g \ge \int_S g S_\kappa f,$$

with strict inequality unless $f = g$ a.e. Since Lemma 2.2 gives the reverse
inequality, we have $f = g$ a.e., and thus
$f = \Phi_\kappa f = \Phi_\kappa g = g$. The second statement then follows from
Lemma 2.1. □
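
The pointwise step in the proof compares the two power series term by term: $f g^k / k \ge g f^k / k$ whenever $0 \le f \le g$. A quick numerical check of the resulting inequality $-f\log(1-g) \ge -g\log(1-f)$ is sketched below; the sampling scheme is an arbitrary illustrative choice.

```python
import math
import random

def gap(f, g):
    """f * S(g) - g * S(f), where S(h) = -log(1 - h) as for solutions of (8)."""
    return -f * math.log1p(-g) + g * math.log1p(-f)

rng = random.Random(1)
samples = []
for _ in range(10_000):
    g = rng.random() * 0.999      # keep g strictly below 1
    f = rng.random() * g          # ensures 0 <= f <= g
    samples.append(gap(f, g))
min_gap = min(samples)            # should be non-negative up to rounding
```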



Theorem 2.4 generalizes the corresponding result in [10], namely Lemma 5.9.
Indeed, in the edge-only case (when only $\kappa_2$ is non-zero), the operators
$S_\kappa$ and $T_{\kappa_e}$ coincide, and Lemma 2.2 holds trivially, using the
symmetry of $T_{\kappa_e}$. This shows that, with hindsight, the proof of
Lemma 5.9 in [10] may be simplified considerably, by considering $\int_S f T g$
instead of $\int_S f T h$, $h = (g - f)/2$. This is significant, since the proof
in [10] does not adapt readily to the present context.

Although simple, the proof of Theorem 2.4 above is a little mysterious from
a branching process point of view. It is tempting to think that the result is
'obvious', and indeed that a corresponding result should hold for any
Galton-Watson process. However, some conditions are certainly necessary, and it
is not clear what the right conditions are for a general process.
(Irreducibility is always needed, of course.) In [36], a corresponding result
is proved for a general branching process satisfying a certain continuity
assumption; the proof uses the convexity property
$\Phi(\lambda f) \ge \lambda \Phi(f)$ for any function $0 \le f \le 1$ and any
$0 \le \lambda \le 1$, which holds for all Galton-Watson branching processes. In
Theorem 2.4 continuity is not needed, but some kind of symmetry is; there does
not seem to be an obvious common generalization of these results.

Indeed, the next example shows that the situation is not that simple; in the 
compound Poisson case (as opposed to the simple Poisson case), symmetry of 
the relevant linear operator is not enough. 

Example 2.5. Let $S = \{1, 2, 3, \ldots\}$ with $\mu\{i\} = 2^{-i}$ for each
$i$, and consider the branching process $X = X(x)$ with type space $(S, \mu)$
defined as follows. Start with a single particle of some given type $x$. Each
particle of type $i$ has a Poisson number of children of type $i + 1$ with mean
$2 = 2^{i+2}\mu\{i+1\}$; we call these 'forward children'. Also, for $i \ge 2$,
a particle of type $i$ has 'backward children' of type $i - 1$: the number of
these is $4^{i+1}$ times a Poisson with mean $4^{-i}$. Note that the expected
number of backward children is $4 = 2^{i+1}\mu\{i-1\}$. Defining the
'edge-kernel' $\kappa_e$ so that the expected number of children of type $j$
that each particle of type $i$ has is given by $\kappa_e(i,j)\mu\{j\}$, we have
$\kappa_e(i,j) = 2^{1+\max\{i,j\}}$ if $|i - j| = 1$ and $\kappa_e(i,j) = 0$
otherwise, so $\kappa_e$ is symmetric and irreducible.

Define the non-linear operator $\Phi$ associated to $X$ in the natural way, so
$\Phi(f)(x)$ is the probability that at least one child of the root of type $x$
has a certain property, if each child of type $y$ has this property
independently with probability $f(y)$. As before, the survival probability
$\rho(x)$ satisfies $\rho = \Phi(\rho)$.

Let $\tau(x)$ denote the probability that the process survives transiently,
i.e., survives forever, but, for each $i$, contains in total only finitely many
particles of type $i$. Consider the 'forward process' given by ignoring backward
children. This is simply a Poisson Galton-Watson process with on average 2
offspring, and so survives with some positive probability. Also, given that it
survives, there is a positive probability that for every $t$, generation $t$
contains at most $3^t$ particles, say. But since the particles in generation $t$
have type $x + t$, the expected number of sets of backward children of all
particles in the forward process is at most
$\sum_{t \ge 0} 3^t 4^{-t-x} < \infty$, and with positive probability the
particles in the forward process have no backward children. But in this case,
the forward process is the whole process, and the process survives transiently.
Hence $\tau(x) > 0$ for every $x$.

Let $\sigma(x) = \rho(x) - \tau(x)$ be the probability that the process survives
recurrently. Considering the children of the initial particle, we see that
$\sigma = \Phi(\sigma)$. The process restricted to any two consecutive types is
already supercritical, and so has positive probability of surviving by
alternating between these types. Thus $\sigma(x) > 0$ for all $x$. We showed
above that $\tau(x) = \rho(x) - \sigma(x) > 0$ for all $x$, so
$0 < \sigma(x) < \rho(x)$, and $f = \Phi(f)$ has (at least) two non-zero
solutions, namely $\sigma$ and $\rho$.
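
The offspring means in Example 2.5 can be recovered from the edge kernel by exact arithmetic. The sketch below assumes the reading $\kappa_e(i,j) = 2^{1+\max\{i,j\}}$ for $|i-j| = 1$, which is the value forced by the two stated means together with $\mu\{i\} = 2^{-i}$; the helper names are illustrative.

```python
from fractions import Fraction

def mu(i):
    """Type measure of Example 2.5: mu{i} = 2^(-i)."""
    return Fraction(1, 2 ** i)

def kappa_e(i, j):
    """Edge kernel: 2^(1 + max(i, j)) when |i - j| = 1, and 0 otherwise."""
    return Fraction(2 ** (1 + max(i, j))) if abs(i - j) == 1 else Fraction(0)

def mean_children(i, j):
    """Expected number of type-j children of a type-i particle."""
    return kappa_e(i, j) * mu(j)

forward = [mean_children(i, i + 1) for i in range(1, 8)]
backward = [mean_children(i, i - 1) for i in range(2, 8)]
# Two-step mean for the process restricted to types {i, i + 1}.
two_step = forward[0] * backward[0]
```

The product of the forward and backward means exceeds 1, which is the sense in which the process restricted to two consecutive types is already supercritical.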

Let us turn to the analysis of the solution $\rho_\kappa$ to (8), and in
particular the question of when $\rho(\kappa) > 0$, i.e., when the branching
process $X_\kappa$ is supercritical. Throughout we consider an integrable
hyperkernel $\kappa$, with corresponding edge kernel $\kappa_e$.

Recall that we may assume that $\lambda(x) = S_\kappa(1)(x)$ is finite
everywhere. Hence, for any $f$ satisfying (8), we have $f(x) < 1$ for all $x$.
On the other hand, we cannot assume that $\kappa_e$ is integrable, or indeed
finite. For one natural example, consider the integrable hyperkernel with each
$\kappa_r$ constant, and $\kappa_r = 1/r^3$. In this case
$\kappa_e(x,y) = \infty$ for all $x$ and $y$. If $\kappa_e$ is infinite on a set
of positive measure, then we take $\|T_{\kappa_e}\|$ to be infinite.
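
To see how a hyperkernel can be integrable while its edge kernel is infinite, one can compare partial sums. The sketch below assumes constant kernels $\kappa_r = r^{-3}$ on a probability space, that integrability means $\sum_r r\int\kappa_r < \infty$, and that the edge kernel sums $r(r-1)\kappa_r$ over $r$; these readings are assumptions made for the illustration.

```python
def partial_sums(R):
    """Partial sums up to r = R for constant kernels kappa_r = r**-3."""
    integrability = sum(r * r ** -3 for r in range(2, R + 1))          # sum of r * kappa_r
    edge_kernel = sum(r * (r - 1) * r ** -3 for r in range(2, R + 1))  # sum of r(r-1) * kappa_r
    return integrability, edge_kernel

for R in (10, 1_000, 100_000):
    print(R, partial_sums(R))
```

The first column of sums stays bounded, while the second grows roughly like $\log R$.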

Lemma 2.6. If $\|T_{\kappa_e}\| \le 1$, then $\rho(\kappa) = 0$.

Proof. Suppose that $f$ is a solution to (8) that is not 0 a.e. Since
$-\log(1 - t) > t$ for $0 < t < 1$, we have $S_\kappa(f)(x) \ge f(x)$, with
strict inequality on a set of positive measure. But
$T_{\kappa_e}(f)(x) \ge S_\kappa(f)(x)$ by (11), so
$T_{\kappa_e}(f)(x) \ge f(x)$, with strict inequality on a set of positive
measure. Hence $\|T_{\kappa_e} f\|_2 > \|f\|_2$, so $\|T_{\kappa_e}\| > 1$. □

Lemmas 5.12 and 5.13 of [10] carry over to the present context, with only
minor modifications. Given functions $f_1, f_2, \ldots$ and $f$, we write
$f_n \nearrow f$ if the sequence $(f_n)$ is monotone increasing and converges to
$f$ pointwise.

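Lemma 2.7. If $0 \le f \le 1$ and $\Phi_\kappa(f) \ge f$, then
$\Phi_\kappa^m(f) \nearrow g$ as $m \to \infty$, for some $1 \ge g \ge f$ with
$\Phi_\kappa(g) = g$.

Proof. Since $f \le \Phi_\kappa(f)$, monotonicity of $\Phi_\kappa$ gives
$\Phi_\kappa(f) \le \Phi_\kappa^2(f)$ and, by induction,
$\Phi_\kappa^m(f) \le \Phi_\kappa^{m+1}(f)$ for all $m \ge 0$. Since
$0 \le \Phi_\kappa^m(f) \le 1$, it follows that
$g(x) = \lim_{m \to \infty} \Phi_\kappa^m(f)(x)$ exists for every $x$, and
$0 \le g \le 1$. From monotone convergence we have
$S_\kappa(g) = \lim_{m \to \infty} S_\kappa(\Phi_\kappa^m(f))$, from which it
follows that $\Phi_\kappa(g) = g$. □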
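
The monotone iteration in Lemma 2.7 can be watched numerically in a toy case. The sketch assumes a single vertex type with constant kernels, so that $S_\kappa(f) = \sum_r r\kappa_r(1-(1-f)^{r-1})$; the weights and the starting value are hypothetical choices for which $\Phi_\kappa(f) \ge f$ holds.

```python
import math

def Phi(f, kappa):
    """Single-type Phi_kappa(f) = 1 - exp(-sum_r r*kappa_r*(1 - (1-f)^(r-1)))."""
    s = sum(r * k * (1.0 - (1.0 - f) ** (r - 1)) for r, k in kappa.items())
    return 1.0 - math.exp(-s)

def iterate_up(f0, kappa, steps=200):
    """Return the orbit f0, Phi(f0), Phi^2(f0), ...; assumes Phi(f0) >= f0."""
    orbit = [f0]
    for _ in range(steps):
        orbit.append(Phi(orbit[-1], kappa))
    return orbit

kappa = {2: 0.5, 3: 0.2}     # hypothetical weights; edge kernel 2*0.5 + 6*0.2 = 2.2
orbit = iterate_up(0.01, kappa)
g = orbit[-1]                # numerical limit, an approximate fixed point of Phi
```

Since the edge kernel here exceeds 1, the orbit climbs from the small starting value to a strictly positive fixed point.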

Lemma 2.8. If there is a function $f : S \to [0,1]$, not a.e. 0, such that
$S_\kappa(f) \ge (1 + \delta) f$ for some $\delta > 0$, then
$\rho(\kappa) > 0$.

Proof. The proof is the same as that of Lemma 5.13 in [10], using $S_\kappa$ in
place of $T_\kappa$. □

The next step is to show that if $\|T_{\kappa_e}\| > 1$ then there is a
function $f$ with the property described in Lemma 2.8. In [10] we did this by
considering a bounded kernel. Here we have to be a little more careful, as we
are working with the non-linear operator $S_\kappa$ rather than with
$T_{\kappa_e}$; this is no problem if we truncate our kernels suitably.

Definition 2.9. We call a hyperkernel $\kappa = (\kappa_r)_{r \ge 2}$ bounded if
two conditions hold: only finitely many of the $\kappa_r$ are non-zero, and each
$\kappa_r$ is bounded.

Similarly (for later use), a general kernel family
$(\kappa_F)_{F \in \mathcal{F}}$ is bounded if only finitely many of the
$\kappa_F$ are non-zero, and each $\kappa_F$ is bounded.

In other words, $\kappa$ is bounded if there are constants $R$ and $M$ such that
$\kappa_r = 0$ for $r > R$, and $\kappa_r$ is pointwise bounded by $M$ for
$r \le R$. Note that if $\kappa$ is bounded, then the corresponding edge kernel
$\kappa_e$ is bounded in the usual sense.

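Given a hyperkernel $\kappa = (\kappa_r)$, for each $M > 0$ we let $\kappa^M$ be
the bounded hyperkernel obtained from $\kappa$ by truncating each $\kappa_r$,
$r \le M$, at $M$, and replacing $\kappa_r$ by a zero kernel for $r > M$. Thus

$$\kappa_r^M = \begin{cases} \kappa_r \wedge M, & r \le M, \\ 0, & r > M. \end{cases} \tag{18}$$

The truncation $\kappa^M = (\kappa_F^M)_{F \in \mathcal{F}}$ of a general kernel
family $(\kappa_F)_{F \in \mathcal{F}}$ is defined similarly, replacing the
condition $r \le M$ by $|F| \le M$.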
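
The truncation (18) is straightforward to express in code. In the sketch below a hyperkernel is represented as a dictionary mapping $r$ to a function of $r$ type arguments; this representation and the example kernels are illustrative assumptions only.

```python
def truncate(kappa, M):
    """Truncated hyperkernel kappa^M: cap each kappa_r with r <= M at M, drop r > M."""
    def capped(k_r):
        # Bind k_r in its own scope so each closure keeps its own kernel.
        return lambda *xs: min(k_r(*xs), M)
    return {r: capped(k_r) for r, k_r in kappa.items() if r <= M}

kappa = {
    2: lambda x, y: 1.0 / (x * y),    # unbounded near the origin
    3: lambda x, y, z: x + y + z,     # already bounded on [0, 1]^3
    7: lambda *xs: 1.0,               # removed entirely once M < 7
}
kappa_5 = truncate(kappa, 5)
```

Here `kappa_5` keeps only the kernels with $r \le 5$, and the first of these is capped at 5 wherever it was larger.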

Lemma 2.10. If $\|T_{\kappa_e}\| > 1$ then there is a $\delta > 0$ and an
$f : S \to [0,1]$, not a.e. 0, such that $S_\kappa(f) \ge (1 + \delta) f$.

Proof. We slightly modify the proof of Lemma 5.16 of [10].

Consider the truncated hyperkernels $\kappa^M$ defined in (18). From (10) and
monotone convergence, the corresponding edge kernels $\kappa_e^M$ tend up to
$\kappa_e$ (which may be infinite in some places) pointwise. Arguing as in the
proof of Lemma 5.16 of [10], since $\|T_{\kappa_e}\| > 1$ there is some positive
$f$ with $\|f\|_2 = 1$ and $1 < \|T_{\kappa_e} f\|_2 \le \infty$. By monotone
convergence, $T_{\kappa_e^M} f \nearrow T_{\kappa_e} f$, so
$\|T_{\kappa_e^M} f\|_2 \nearrow \|T_{\kappa_e} f\|_2$, and there is some $M$
with $\|T_{\kappa_e^M}\| \ge \|T_{\kappa_e^M} f\|_2 > 1$.

Since $\kappa_e^M$ is bounded, setting
$\delta = (\|T_{\kappa_e^M}\| - 1)/2 > 0$, by Lemma 5.15 of [10] it follows that
there is a bounded $f \ge 0$ with $f$ not a.e. 0 such that

$$T_{\kappa_e^M} f = \|T_{\kappa_e^M}\| f = (1 + 2\delta) f.$$

We may assume that $0 \le f \le 1$. If $0 \le y_i \le \gamma \le 1$,
$i = 1, \ldots, r$, then (by induction)
$1 - \prod_{i=1}^{r} (1 - y_i) \ge (1 - \gamma)^{r-1} \sum_{i=1}^{r} y_i$, and it
follows that if $\gamma > 0$ is chosen small enough, then

$$S_{\kappa^M}(\gamma f) \ge (1 - \gamma)^{M-1} T_{\kappa_e^M}(\gamma f) \ge (1 + \delta)(\gamma f).$$

Since $S_\kappa(\gamma f) \ge S_{\kappa^M}(\gamma f)$, the result follows. □
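
The elementary inequality $1 - \prod_{i=1}^{r}(1 - y_i) \ge (1-\gamma)^{r-1}\sum_{i=1}^{r} y_i$ for $0 \le y_i \le \gamma \le 1$, used in the proof above, follows by induction on $r$: adding $y_{r+1}$ increases the left side by $y_{r+1}\prod_{i \le r}(1 - y_i) \ge y_{r+1}(1-\gamma)^r$. A randomized check, offered only as a sanity test with arbitrary sampling choices:

```python
import math
import random

def holds(ys, gamma):
    """Check 1 - prod(1 - y_i) >= (1 - gamma)^(r-1) * sum(y_i), up to rounding."""
    r = len(ys)
    lhs = 1.0 - math.prod(1.0 - y for y in ys)
    rhs = (1.0 - gamma) ** (r - 1) * sum(ys)
    return lhs >= rhs - 1e-12

rng = random.Random(2)
ok = True
for _ in range(5_000):
    r = rng.randint(1, 8)
    gamma = rng.random()
    ys = [rng.random() * gamma for _ in range(r)]   # each y_i lies in [0, gamma]
    ok = ok and holds(ys, gamma)
```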

Theorem 1.7 follows by combining the results above.

Proof of Theorem 1.7. Together Lemmas 2.6, 2.8 and 2.10 show that
$\rho(\kappa) > 0$ if and only if $\|T_{\kappa_e}\| > 1$. Uniqueness is given by
Theorem 2.4. The final statement is immediate from Lemma 2.3. □

Having proved Theorem 1.7, our next aim is to prove Theorem 1.5. The basic
strategy will involve comparing the neighbourhoods of a vertex in the random
graph $G(n, \kappa)$ with the branching process $X_\kappa$. As in [10], it will
be convenient to carry out the comparison only for certain restricted
hyperkernels. In order to deduce results about $G(n, \kappa)$ in general, one
needs approximation results both for the graph and for the branching process.
We now turn to such results for branching processes.

Lemma 6.3 and Theorems 6.4 and 6.5 of [10] carry over to the present context,
mutatis mutandis, using the results above about $\rho(\kappa)$ instead of the
equivalents in [10], and replacing $T_\kappa$ by $S_\kappa$ or $T_{\kappa_e}$ as
appropriate: $S_\kappa$ when considering $\Phi_\kappa$, and $T_{\kappa_e}$ when
arguing using $L^2$-norms. In these results $\rho_\kappa$ denotes the function
$x \mapsto \rho_\kappa(x)$, and $\rho_{\ge k}(\kappa; x)$ and
$\rho_{\ge k}(\kappa)$ denote respectively the probabilities that
$X_\kappa(x)$ and $X_\kappa$ have total size at least $k$, where the size of a
branching process is the total number of particles in all generations.

Lemma 2.11. If $\kappa \le \kappa'$, then $\rho(\kappa) \le \rho(\kappa')$. □

Theorem 2.12. (i) Let $\kappa_n$, $n = 1, 2, \ldots$, be a sequence of
hyperkernels on $(S, \mu)$ increasing a.e. to an integrable hyperkernel
$\kappa$. Then $\rho_{\kappa_n} \nearrow \rho_\kappa$ a.e. and
$\rho(\kappa_n) \nearrow \rho(\kappa)$.

(ii) Let $\kappa_n$, $n = 1, 2, \ldots$, be a sequence of integrable
hyperkernels on $(S, \mu)$ decreasing a.e. to $\kappa$. Then
$\rho_{\kappa_n} \searrow \rho_\kappa$ a.e. and
$\rho(\kappa_n) \searrow \rho(\kappa)$. □

Theorem 2.13. (i) Let $\kappa_n$, $n = 1, 2, \ldots$, be a sequence of
hyperkernels on $(S, \mu)$ increasing a.e. to a hyperkernel $\kappa$. Then, for
every $k \ge 1$, $\rho_{\ge k}(\kappa_n; x) \nearrow \rho_{\ge k}(\kappa; x)$ for
a.e. $x$ and $\rho_{\ge k}(\kappa_n) \nearrow \rho_{\ge k}(\kappa)$.

(ii) Let $\kappa_n$, $n = 1, 2, \ldots$, be a sequence of integrable
hyperkernels on $(S, \mu)$ decreasing a.e. to $\kappa$. Then, for every
$k \ge 1$, $\rho_{\ge k}(\kappa_n; x) \searrow \rho_{\ge k}(\kappa; x)$ for a.e.
$x$ and $\rho_{\ge k}(\kappa_n) \searrow \rho_{\ge k}(\kappa)$. □

Remark 2.14. The assumption that $\kappa_n$ be integrable in Theorems 2.12(ii)
and 2.13(ii) can be weakened to $\lambda_{\kappa_n}(x) < \infty$ for a.e. $x$,
where $\lambda_{\kappa_n}(x)$ is the expected number of child cliques in
$X_{\kappa_n}$ of a particle of type $x$; see [10].

3 Local coupling 

We now turn to the local coupling between our random graph and the
corresponding branching process, relating the distribution of small components
in $G(n, \kappa)$ to the branching process $X_\kappa$. In [10], we were
essentially forced to condition on the vertex types, since these were allowed to
be deterministic to start with. Here, with i.i.d. vertex types, there is no need
to do so. This allows us to couple directly for all bounded hyperkernels, rather
than simply for finite type ones.

We shall consider a variant of the usual component exploration process,
designed to get around the following problem. When we test edges from a given
vertex $v$ to all other vertices, the probability of finding a given edge $vw$
depends on the type of $w$ as well as that of $v$. Hence, not finding such an
edge changes the conditional distribution of the type of $w$. If the kernel is
well behaved, it is easy to see that this is a small effect. Rather than
quantify this, it is easier to embed $G(n, \kappa)$ inside a larger random graph
with uniform kernels. Testing edges in the larger graph does not affect the
conditional distribution of the vertex types; we make this precise below. In
doing so, it will be useful to take the hypergraph viewpoint: given a
hyperkernel $\kappa$, let $H(n, \kappa)$ be the hypergraph on $[n]$ constructed
according to the same rules as $G(n, \kappa)$, except that instead of adding a
$K_r$ we add a hyperedge with $r$ vertices. In fact, we consider the Poisson
version of the model, allowing multiple copies of the same hyperedge.

Let $\kappa$ be a bounded hyperkernel, and let $\kappa^+$ be a corresponding
upper bound, so $\kappa_r^+$ is the constant kernel $M$ for $r \le R$, and zero
for $r > R$, while $\kappa_r \le \kappa_r^+$ holds pointwise for all $r$.

Taking, as usual, our vertex types $x_1, \ldots, x_n \in S$ to be independent,
each having the distribution $\mu$, we construct coupled random
(multi-)hypergraphs $H_n$ and $H_n^+$ on $[n]$ as follows: first construct
$H_n^+ = H(n, \kappa^+)$ by taking, for every $2 \le r \le R$, a Poisson
$\mathrm{Po}(r! M / n^{r-1})$ number of copies of each possible $r$-element
hyperedge, with all these numbers independent. Although in our formal definition
of $H_n^+$ we first decide the vertex types, $H_n^+$ is clearly independent of
these types. Hence, given $H_n^+$, the types are (still) i.i.d. with
distribution $\mu$.

Given $H_n^+$ and the i.i.d. types $x_1, \ldots, x_n$ of the vertices, we may
form $H_n$ by selecting each hyperedge $\{v_1, \ldots, v_r\}$ of $H_n^+$ to be a
hyperedge of $H_n$ with probability $\kappa_r(x_{v_1}, \ldots, x_{v_r})/M$,
independently of all other hyperedges. It is easy to see that this gives the
right distribution for $H_n = H(n, \kappa)$. (If we disallowed multiple copies
of an edge, there would be an irrelevant small correction here.)
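
The two-stage construction above (first a type-independent Poisson hypergraph, then a type-dependent thinning) can be sketched as follows. The sketch assumes $S = [0,1]$ with uniform types and a hyperkernel given as a dictionary of symmetric functions bounded by $M$; the function names, the dictionary representation and the hand-written Poisson sampler are all illustrative choices rather than anything specified in the text.

```python
import itertools
import math
import random

def poisson(mean, rng):
    """Knuth's multiplication method; adequate for the small means used here."""
    limit, k, p = math.exp(-mean), 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def coupled_hypergraphs(n, kappa, M, R, rng):
    """Sample (H_plus, H, types), with H obtained from H_plus by thinning.

    kappa maps each r in 2..R to a symmetric function bounded by M (assumed,
    not checked). A hypergraph is a dict {vertex tuple: number of copies}.
    """
    H_plus = {}
    for r in range(2, R + 1):
        mean = math.factorial(r) * M / n ** (r - 1)
        for e in itertools.combinations(range(n), r):
            copies = poisson(mean, rng)
            if copies:
                H_plus[e] = copies
    # H_plus never looked at the types, so they may be drawn afterwards.
    types = [rng.random() for _ in range(n)]
    H = {}
    for e, copies in H_plus.items():
        keep_prob = kappa[len(e)](*(types[v] for v in e)) / M
        kept = sum(rng.random() < keep_prob for _ in range(copies))
        if kept:
            H[e] = kept
    return H_plus, H, types

rng = random.Random(3)
kappa = {2: lambda x, y: x + y, 3: lambda x, y, z: 2 * x * y * z}
H_plus, H, types = coupled_hypergraphs(30, kappa, M=2.0, R=3, rng=rng)
```

Drawing the types only after `H_plus` has been built makes the independence claimed in the text visible in the code itself.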

Turning to the branching processes, there is an analogous coupling of
$X_\kappa$ and $X_{\kappa^+}$: first construct $X_{\kappa^+}$, which may be
viewed as a single-type process, according to our two-step construction via
child cliques. Then assign each particle a type according to the distribution
$\mu$, independently of the other particles and of the branching process. Then
form the child cliques in $X_\kappa$ by keeping each child clique in
$X_{\kappa^+}$ with an appropriate probability depending on the types, deleting
not only the children corresponding to deleted child cliques, but also all
their descendants.

Let $v \in [n]$ be chosen uniformly at random, independently of $H_n$ and
$H_n^+$. Let $\Gamma_d$ denote the $d$-neighbourhood of $v$ in $H_n$, and
$\Gamma_d^+$ that in $H_n^+$. Counting the expected number of cycles shows that
for any fixed $d$, the hypergraph $\Gamma_d^+$ is whp treelike. Furthermore,
standard arguments as for $G(n, c/n)$ show that one may couple $\Gamma_d^+$ and
the first $d$ generations of $X_{\kappa^+}$ so as to agree in the natural sense
whp. When $\Gamma_d^+$ is treelike, then $\Gamma_d \subseteq \Gamma_d^+$ may be
constructed using exactly the same random deletion process that gives (the
first $d$ generations of) $X_\kappa$ as a subset of $X_{\kappa^+}$. It follows
that $\Gamma_d$ and the first $d$ generations of $X_\kappa$ may be coupled to
agree whp.

Recalling that $G(n, \kappa)$ and $H_n$ have the same components, for any fixed
$k \ge 1$ one can determine whether the component containing $v$ has exactly $k$
vertices by examining $\Gamma_{k+1}$. Writing $N_k(G)$ for the number of
vertices of a graph $G$ that are in components of size $k$, it follows that

$$\mathbb{E} N_k(G(n, \kappa)) = n \, \mathbb{P}(|X_\kappa| = k) + o(n).$$

As in [10], starting from two random vertices easily gives a corresponding
second moment bound, giving convergence in probability.

Lemma 3.1. Let $\kappa$ be a bounded hyperkernel. Then

$$\frac{1}{n} N_k(G(n, \kappa)) \overset{p}{\to} \mathbb{P}(|X_\kappa| = k)$$

for any fixed $k$. □

Of course it makes no difference whether we work with $N_k$ or
$N_{\ge k} = n - \sum_{j=1}^{k-1} N_j$. Lemma 3.1 also tells us that

$$\frac{1}{n} N_{\ge k}(G(n, \kappa)) \overset{p}{\to} \mathbb{P}(|X_\kappa| \ge k). \tag{19}$$

The extension to arbitrary hyperkernels is easy from Theorem 2.13.



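Lemma 3.2. Let $\kappa$ be an integrable hyperkernel. Then for each fixed $k$
we have

$$\frac{1}{n} N_{\ge k}(G(n, \kappa)) \overset{p}{\to} \rho_{\ge k}(\kappa).$$

Proof. As in [10], we simply approximate $\kappa$ by bounded hyperkernels. For
$M > 0$ let $\kappa^M$ be the truncated hyperkernel defined by (18).

Let $k \ge 1$ be fixed, and let $\varepsilon > 0$ be arbitrary. From monotone
convergence and integrability,

$$\sum_{r \ge 2} (r - 1) \int_{S^r} (\kappa_r - \kappa_r^M) \to 0 \quad \text{as } M \to \infty,$$

so for $M$ large enough we have

$$\sum_{r \ge 2} (r - 1) \int_{S^r} (\kappa_r - \kappa_r^M) \le \frac{\varepsilon^2}{6k} = \Delta,$$

say. By Theorem 2.13(i), increasing $M$ if necessary, we may also assume that

$$\rho_{\ge k}(\kappa^M) \ge \rho_{\ge k}(\kappa) - \varepsilon/3. \tag{20}$$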

Since $\kappa^M \le \kappa$ holds pointwise, we may couple the hypergraphs
$H_n^M$ and $H_n$ associated to $G(n, \kappa^M)$ and $G(n, \kappa)$ so that
$H_n^M \subseteq H_n$. Recall that $G(n, \kappa)$ is produced from $H_n$ by
replacing each hyperedge $E$ with $r$ vertices by an $r$-clique. However, as
noted earlier, if we form $G_n$ from $H_n$ by replacing each $E$ by any
connected simple graph on the same set of vertices, then $G_n$ and
$G(n, \kappa)$ will have exactly the same component structure, and in particular
$N_{\ge k}(G_n) = N_{\ge k}(G(n, \kappa))$. Let us form $G_n$ and $G_n'$ in this
way from $H_n$ and $H_n^M$, replacing any hyperedge with $r$ vertices by some
tree on the same set of vertices. Recalling that $H_n^M \subseteq H_n$, we may
of course assume that $G_n' \subseteq G_n$.

Writing $e_r(H)$ for the number of $r$-vertex hyperedges in a hypergraph $H$,

$$\mathbb{E}\bigl(|E(G_n) \setminus E(G_n')|\bigr)
  \le \sum_{r \ge 2} (r - 1) \, \mathbb{E}\bigl(e_r(H_n) - e_r(H_n^M)\bigr)
  \le \sum_{r \ge 2} (r - 1) \, n \int_{S^r} (\kappa_r - \kappa_r^M)
  \le n\Delta.$$

Hence,

$$\mathbb{P}\bigl(|E(G_n) \setminus E(G_n')| \ge \varepsilon n / 6k\bigr)
  \le n\Delta / (\varepsilon n / 6k) \le \varepsilon.$$

Recalling that $G_n' \subseteq G_n$ and noting that adding one edge to a graph
cannot change $N_{\ge k}$ by more than $2k$, we see that with probability at
least $1 - \varepsilon$ we have

$$\bigl|N_{\ge k}(G(n, \kappa^M)) - N_{\ge k}(G(n, \kappa))\bigr|
  = \bigl|N_{\ge k}(G_n') - N_{\ge k}(G_n)\bigr|
  \le 2k(\varepsilon n / 6k) = \varepsilon n / 3.$$

Applying Lemma 3.1 (or rather (19)) to the bounded hyperkernel $\kappa^M$, we
have $\frac{1}{n} N_{\ge k}(G(n, \kappa^M)) \overset{p}{\to} \rho_{\ge k}(\kappa^M)$.
Using (20) it follows that when $n$ is large enough, with probability at least
$1 - 2\varepsilon$, say, we have
$|\frac{1}{n} N_{\ge k}(G(n, \kappa)) - \rho_{\ge k}(\kappa)| \le \varepsilon$.
Since $\varepsilon > 0$ was arbitrary, we thus have
$\frac{1}{n} N_{\ge k}(G(n, \kappa)) \overset{p}{\to} \rho_{\ge k}(\kappa)$ as
required. □
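
The observation that a single added edge changes $N_{\ge k}$ by at most $2k$ can be checked on a random example. When an edge merges two components of sizes $a, b < k$ with $a + b \ge k$, the count rises by $a + b \le 2k - 2$; in every other case it rises by less than $k$. The sketch below uses a simple union-find; the graph size and the number of edges are arbitrary illustrative values.

```python
import random
from collections import Counter

def n_at_least(n, edges, k):
    """Number of vertices lying in components of size at least k."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    for u, v in edges:
        parent[find(u)] = find(v)
    sizes = Counter(find(v) for v in range(n))
    return sum(s for s in sizes.values() if s >= k)

rng = random.Random(4)
n, k = 60, 5
edges, max_jump = [], 0
before = n_at_least(n, edges, k)
for _ in range(80):
    edges.append((rng.randrange(n), rng.randrange(n)))
    after = n_at_least(n, edges, k)
    max_jump = max(max_jump, after - before)   # largest single-edge increase seen
    before = after
```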

4 The giant component

The local coupling results of the previous section easily give us the 'right'
number of vertices in large components. As usual, we will pass from this to a
giant component by using the 'sprinkling' method of Erdős and Rényi [26], first
uncovering the bulk of the edges, and then using the remaining 'sprinkled' edges
to join up the large components. The following lemma gathers together the
relevant consequences of the results in the previous section.

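Lemma 4.1. Let $\kappa = (\kappa_r)$ be an integrable hyperkernel, and let
$G_n = G(n, \kappa)$. Then $C_1(G_n) \le \rho(\kappa) n + o_p(n)$. Furthermore,
given any $\varepsilon > 0$, there is a $\delta > 0$ and a function
$\omega = \omega(n) \to \infty$ such that

$$N_{\ge \omega}(G_n') \ge (\rho(\kappa) - \varepsilon) n \tag{21}$$

holds whp, where $G_n' = G(n, (1 - \delta)\kappa)$.

Proof. From Lemma 3.2 we have
$\frac{1}{n} N_{\ge k}(G_n) \overset{p}{\to} \rho_{\ge k}(\kappa)$ for each fixed
$k$. Since $\rho_{\ge k}(\kappa) \to \rho(\kappa)$ as $k \to \infty$, it follows
that for some $\omega = \omega(n) \to \infty$ we have

$$\frac{1}{n} N_{\ge \omega}(G_n) \overset{p}{\to} \rho(\kappa); \tag{22}$$

we may and shall assume that $\omega = o(n)$. Since
$C_1(G_n) \le \max\{\omega, N_{\ge \omega}(G_n)\}$, the first statement of the
lemma follows.

For the second, we may of course assume that $\rho(\kappa) > \varepsilon$;
otherwise, there is nothing to prove. As $\delta \to 0$, from Theorem 2.12(i) we
have $\rho((1 - \delta)\kappa) \to \rho(\kappa)$. Fix $\delta > 0$ with
$\rho((1 - \delta)\kappa) > \rho(\kappa) - \varepsilon/2$, and let
$G_n' = G(n, (1 - \delta)\kappa)$. Applying (22) to $G_n'$, there is some
$\omega = \omega(n) \to \infty$ such that

$$N = N_{\ge \omega}(G_n') = \rho((1 - \delta)\kappa) n + o_p(n)
  \ge (\rho(\kappa) - \varepsilon/2) n + o_p(n),$$

which implies (21). □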



In the light of Lemma 4.1, and writing $G_n$ for $G(n, \kappa)$, to prove
Theorem 1.5 it suffices to show that if $\kappa$ is irreducible, then for any
$\varepsilon > 0$ we have

$$C_1(G_n) \ge (\rho(\kappa) - 2\varepsilon) n \tag{23}$$

whp; then $C_1(G_n)/n \overset{p}{\to} \rho(\kappa)$ as required. Also, from (22)
and the fact that
$C_1(G_n) + C_2(G_n) \le \max\{2\omega, N_{\ge \omega}(G_n) + \omega\}$, we
obtain $C_2(G_n) = o_p(n)$ as claimed.

Since $(1 - \delta)\kappa \le \kappa$, there is a natural coupling of the graphs
$G_n'$ and $G_n$ appearing in Lemma 4.1 in which $G_n' \subseteq G_n$ always
holds. Our aim is to show that, whp, in passing from $G_n'$ to $G_n$, the extra
'sprinkled' edges join up almost all of the $N$ vertices of $G_n'$ in 'large'
components (those of size at least $\omega$) into a single component.

Unfortunately, we have to uncover the vertex types before sprinkling, so we
do not have the usual independence between the bulk and sprinkled edges. A
similar problem arose in Bollobás, Borgs, Chayes and Riordan [9] in the graph
context, as opposed to the present hypergraph context. It turns out that we can
easily reduce to the graph case, and thus apply a lemma from [9]. This needs a
little setting up, however. Here it will be convenient to take $S = [0,1]$ with
$\mu$ Lebesgue measure; as noted earlier, this loses no generality.

Let $f$ be a bounded symmetric measurable function $f : [0,1]^2 \to \mathbb{R}$.
Following Frieze and Kannan [28], the cut norm $\|f\|_\square$ of $f$ is defined
by

$$\|f\|_\square = \sup_{S, T \subseteq [0,1]} \left| \int_{S \times T} f(x,y) \, dx \, dy \right|,$$

where the supremum is taken over all pairs of measurable sets. Note that
$\|f\|_\square \le \|f\|_1$, since the integral above is bounded by
$\int_{S \times T} |f| \le \int_{[0,1]^2} |f|$.
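
For a step function the cut norm can be computed by brute force, which makes the comparison with the $L^1$ norm concrete. The sketch below works with the piecewise-constant function on $[0,1]^2$ that takes the value $a_{ij}$ on the $(i,j)$ block of an $n$-by-$n$ grid. It relies on the observation that the integral over $S \times T$ is then bilinear in the measures of $S$ and $T$ within each block, so the supremum is attained when $S$ and $T$ are unions of whole blocks. The example matrix is hypothetical, and the enumeration is exponential in $n$.

```python
import itertools

def cut_norm(A):
    """Cut norm of the step function of an n-by-n matrix A (brute force, small n)."""
    n = len(A)
    subsets = [s for m in range(n + 1) for s in itertools.combinations(range(n), m)]
    best = 0.0
    for S in subsets:
        # Column sums of A restricted to the rows in S.
        col = [sum(A[i][j] for i in S) for j in range(n)]
        for T in subsets:
            best = max(best, abs(sum(col[j] for j in T)))
    return best / n ** 2

def l1_norm(A):
    """L^1 norm of the same step function."""
    n = len(A)
    return sum(abs(a) for row in A for a in row) / n ** 2

A = [[0, 1, -1], [1, 0, 2], [-1, 2, 0]]   # symmetric, with mixed signs
```

With entries of mixed sign the cut norm is strictly smaller than the $L^1$ norm, since cancellation inside $S \times T$ cannot be avoided entirely.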

Given a kernel $\kappa$ on $[0,1]$ and a measurable function
$\varphi : [0,1] \to [0,1]$, let $\kappa^{(\varphi)}$ be the kernel defined by

$$\kappa^{(\varphi)}(x,y) = \kappa(\varphi(x), \varphi(y)).$$

If $\varphi$ is a measure-preserving bijection, then $\kappa^{(\varphi)}$ is a
rearrangement of $\kappa$. (One can also consider measure-preserving bijections
between subsets of $[0,1]$ with full measure; it makes no difference.) We write
$\kappa \sim \kappa'$ if $\kappa'$ is a rearrangement of $\kappa$.

Given two kernels $\kappa$, $\kappa'$ on $[0,1]$, the cut metric of Borgs,
Chayes, Lovász, Sós and Vesztergombi [18] is defined by

$$\delta_\square(\kappa, \kappa') = \inf_{\kappa'' \sim \kappa'} \|\kappa - \kappa''\|_\square.$$

Note that this is a pseudo-metric rather than a metric, as we can have
$\delta_\square(\kappa, \kappa') = 0$ for different kernels. (Probabilistically,
it is probably more natural to consider couplings between kernels as in [18],
rather than rearrangements, but this is harder to describe briefly and turns out
to make no difference.)

Let $A_n$ be a symmetric $n$-by-$n$ matrix with non-negative entries $a_{ij}$,
which we may think of as a (dense) weighted graph. There is a piecewise-constant
kernel $\kappa_{A_n}$ associated to $A_n$; this simply takes the value $a_{ij}$
on the square $((i-1)/n, i/n] \times ((j-1)/n, j/n]$, $1 \le i, j \le n$. There
is also a sparse random graph $G(A_n)$ associated to $A_n$; this is the graph on
$[n]$ in which edges are present independently, and the probability that $ij$ is
an edge is $a_{ij}/n$. (If $A_n$ has non-zero diagonal entries then $G(A_n)$ may
contain loops. These are irrelevant here.)
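
Both objects attached to a matrix, the step kernel and the sparse random graph, are simple to write down. The sketch below uses 0-based indices, ignores loops, and treats any value $a_{ij}/n \ge 1$ as probability 1; the function names are illustrative.

```python
import math
import random

def step_kernel(A):
    """Kernel taking the value a_ij on ((i-1)/n, i/n] x ((j-1)/n, j/n] (1-based i, j)."""
    n = len(A)

    def block(x):
        # 1-based block index is ceil(x * n); clamp and shift to 0-based.
        return min(n - 1, max(0, math.ceil(x * n) - 1))

    return lambda x, y: A[block(x)][block(y)]

def sample_graph(A, rng):
    """Edge set of G(A): each pair i < j is an edge independently with prob a_ij / n."""
    n = len(A)
    return {(i, j) for i in range(n) for j in range(i + 1, n)
            if rng.random() < A[i][j] / n}

A = [[0, 3], [3, 1]]
kappa_A = step_kernel(A)
G = sample_graph(A, random.Random(0))
```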

The main result of Bollobás, Borgs, Chayes and Riordan [9] is that if $\kappa$
is an irreducible bounded kernel and $(A_n)$ is a sequence of matrices with
uniformly bounded entries such that
$\delta_\square(\kappa_{A_n}, \kappa) \to 0$, then the normalized size of the
giant component in $G(A_n)$ converges in probability to $\rho(\kappa)$. The
sprinkling argument there relies on the following lemma concerning the graph
$G(A_n)$, in which edges are present independently.

Lemma 4.2. Let $\kappa$ be an irreducible bounded kernel on $[0,1]$, and
$\delta$ and $\beta_{\max}$ positive constants. There is a constant
$c = c(\kappa, \beta_{\max}, \delta) > 0$ such that whenever $A_n$ is a sequence
of symmetric matrices with entries in $[0, \beta_{\max}]$ with
$\delta_\square(\kappa_{A_n}, \kappa) \to 0$, then for sufficiently large $n$ we
have

$$\mathbb{P}\bigl(V_n \to_{G(A_n)} V_n'\bigr) \ge 1 - \exp(-cn)$$

for all disjoint $V_n, V_n' \subseteq [n]$ with $|V_n|, |V_n'| \ge \delta n$,
where $V_n \to_{G(A_n)} V_n'$ denotes the event that $G(A_n)$ contains a path
starting in $V_n$ and ending in $V_n'$. □

In fact, this lemma is not stated explicitly in [9], but this is exactly the
content of the end of Section 3 there; for an explicit statement and proof of (a
stronger version of) this lemma see [12, Lemma 2.14].

We shall apply Lemma 4.2 to graphs $G(A_n)$ corresponding to (subgraphs of)
$G(n, \delta\kappa)$, where $\delta$ is as in Lemma 4.1. To achieve independence
between edges, we shall simply take only one edge from each hyperedge.
Unfortunately, the problem of conditioning on the $x_i$ still remains; we shall
return to this shortly.

Definition 4.3. Let $\kappa$ be an integrable hyperkernel and let $H_n$ be the
Poisson (multi-)hypergraph corresponding to $G(n, \kappa)$. Given the sequence
$\mathbf{x} = (x_1, \ldots, x_n)$, let $G(n, \kappa, \mathbf{x})$ be the random
(multi-)graph formed from $H_n$ by replacing each $r$-vertex hyperedge $E$ by a
single edge, chosen uniformly at random from the $\binom{r}{2}$ edges
corresponding to $E$.

With $\mathbf{x}$ fixed, the numbers of copies of each hyperedge $E$ in $H_n$
are independent Poisson random variables. From basic properties of Poisson
processes, it follows that, with $\mathbf{x}$ fixed, the numbers of copies of
each edge $ij$ in $G(n, \kappa, \mathbf{x})$ are also independent Poisson random
variables. Our next aim is to calculate the edge probabilities in
$G(n, \kappa, \mathbf{x})$.

As usual, we write $a_{(b)}$ for the falling factorial
$a(a-1)\cdots(a-b+1)$. Given $x_1, \ldots, x_n$ and distinct $i, j \in [n]$, for
$r \ge 2$ let

$$a_{r,ij} = n^{-(r-2)} \sum_{k_3, \ldots, k_r} \kappa_r(x_i, x_j, x_{k_3}, \ldots, x_{k_r}),$$

where the sum runs over all $(n-2)_{(r-2)}$ sequences $k_3, \ldots, k_r$ of
distinct indices in $[n] \setminus \{i, j\}$, and let $A$ be the $n$-by-$n$
matrix with entries

$$a_{ij} = \sum_{r \ge 2} 2 a_{r,ij} \tag{24}$$

for $i \ne j$ and $a_{ij} = 0$ if $i = j$.

With $\mathbf{x}$ given, the expected number of $r$-vertex hyperedges in $H_n$
containing $ij$ is $r(r-1) a_{r,ij}/n$. Hence the expected number of $ij$ edges
in $G(n, \kappa, \mathbf{x})$ is exactly $a_{ij}/n$. Now $a_{ij}$ clearly
depends on $x_i$ and $x_j$. Unfortunately, it also depends on all the other
$x_k$. The next lemma will show that the latter dependence can be neglected.
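
The quantities $a_{r,ij}$ and $a_{ij}$ can be computed directly for small $n$. The sketch below assumes the normalisation $a_{r,ij} = n^{-(r-2)}\sum \kappa_r(x_i, x_j, x_{k_3}, \ldots, x_{k_r})$ over ordered tuples of distinct indices avoiding $i$ and $j$, and $a_{ij} = \sum_r 2a_{r,ij}$. These formulas are inferred from the expected edge counts stated in the text and from the identities used in the proof of Lemma 4.4, so they should be read as a reconstruction. The example kernels are hypothetical.

```python
import itertools

def a_r(kappa_r, r, x, i, j):
    """a_{r,ij}: n^(-(r-2)) times the sum of kappa_r over ordered (r-2)-tuples avoiding i, j."""
    n = len(x)
    others = [k for k in range(n) if k not in (i, j)]
    total = sum(kappa_r(x[i], x[j], *(x[k] for k in ks))
                for ks in itertools.permutations(others, r - 2))
    return total / n ** (r - 2)

def a_entry(kappa, x, i, j):
    """a_ij: sum over r of 2 * a_{r,ij} for i != j, and 0 on the diagonal."""
    if i == j:
        return 0.0
    return sum(2 * a_r(k_r, r, x, i, j) for r, k_r in kappa.items())

x = [0.1, 0.2, 0.3, 0.4]
kappa = {2: lambda u, v: u * v, 3: lambda u, v, w: u + v + w}
a01 = a_entry(kappa, x, 0, 1)
```

For $r = 2$ the inner sum has a single empty tuple, so $a_{2,ij}$ is just $\kappa_2(x_i, x_j)$; only the terms with $r \ge 3$ involve the other vertex types.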

Set

$$\kappa_r(x, y, *) = \int_{S^{r-2}} \kappa_r(x, y, x_3, x_4, \ldots, x_r) \, d\mu(x_3) \cdots d\mu(x_r),$$

and let $\tau$ be the 're-scaled' edge kernel defined by

$$\tau(x, y) = \sum_{r \ge 2} 2 \kappa_r(x, y, *). \tag{25}$$

Comparing with the formula (10) for $\kappa_e(x, y)$, note that we have divided
each term in the sum in (10) by $\binom{r}{2}$, the number of edges in $K_r$.
Note that

$$\tau(x, y) = 0 \iff \kappa_e(x, y) = 0. \tag{26}$$

Recall that $a_{r,ij}$ and $a_{ij}$ depend on the random sequence $\mathbf{x}$.
In the next lemma, the expectation is over the random choice of $\mathbf{x}$; no
graphs appear at this stage.

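Lemma 4.4. Let $\kappa = (\kappa_r)_{r \ge 2}$ be an integrable hyperkernel.
Then

$$\mathbb{E} \, \frac{1}{n^2} \sum_{i \ne j} \bigl|a_{r,ij} - \kappa_r(x_i, x_j, *)\bigr| = o(1) \tag{27}$$

for every $r$, and

$$\mathbb{E} \, \frac{1}{n^2} \sum_{i \ne j} \bigl|a_{ij} - \tau(x_i, x_j)\bigr| = o(1). \tag{28}$$

Proof. We have

$$\mathbb{E}(a_{r,ij} \mid x_i, x_j) = (n-2)_{(r-2)} \, n^{-(r-2)} \kappa_r(x_i, x_j, *).$$

Suppose first that $\kappa_r$ is bounded. Let

$$Y_{ij} = a_{r,ij} - (n-2)_{(r-2)} \, n^{-(r-2)} \kappa_r(x_i, x_j, *)
  = n^{-(r-2)} \sum \bigl(\kappa_r(x_i, x_j, x_{k_3}, \ldots, x_{k_r}) - \kappa_r(x_i, x_j, *)\bigr),$$

where the sum again runs over all $(n-2)_{(r-2)}$ sequences
$k_3, \ldots, k_r$ of distinct indices in $[n] \setminus \{i, j\}$. Given $x_i$
and $x_j$, each term in the sum has mean 0, and any two terms with disjoint
index sets $\{k_3, \ldots, k_r\}$ are independent. Since there are
$O(n^{2r-5})$ pairs of terms with overlapping index sets, and $\kappa_r$ is
bounded, we have

$$\mathbb{E}(Y_{ij}^2 \mid x_i, x_j) = O(n^{2r-5-2(r-2)}) = O(n^{-1}).$$

Thus $\mathbb{E} Y_{ij}^2 = O(n^{-1})$. Hence, by the Cauchy-Schwarz inequality,
$\mathbb{E}|Y_{ij}| \le (\mathbb{E} Y_{ij}^2)^{1/2} = O(n^{-1/2})$, and

$$\mathbb{E} \, \frac{1}{n^2} \sum_{i \ne j} \bigl|a_{r,ij} - \kappa_r(x_i, x_j, *)\bigr|
  = \frac{1}{n^2} \sum_{i \ne j} \mathbb{E}|Y_{ij}| + O(1/n) = O(n^{-1/2}).$$

This proves (27) and thus (28) for bounded hyperkernels.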
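
For general hyperkernels, we use truncation and define $\kappa^M$ by (18). For
the corresponding $a_{ij}^{(M)}$, $A^{(M)}$ and $\tau^M$,

$$\mathbb{E} \, \frac{1}{n^2} \sum_{i \ne j} \bigl|a_{ij} - a_{ij}^{(M)}\bigr|
  \le \sum_{r \ge 2} 2 \int_{S^r} (\kappa_r - \kappa_r^M),$$

$$\mathbb{E} \, \frac{1}{n^2} \sum_{i \ne j} \bigl|\tau(x_i, x_j) - \tau^M(x_i, x_j)\bigr|
  \le \sum_{r \ge 2} 2 \int_{S^r} (\kappa_r - \kappa_r^M),$$

and thus

$$\mathbb{E} \, \frac{1}{n^2} \sum_{i \ne j} \bigl|a_{ij} - \tau(x_i, x_j)\bigr|
  \le \mathbb{E} \, \frac{1}{n^2} \sum_{i \ne j} \bigl|a_{ij}^{(M)} - \tau^M(x_i, x_j)\bigr|
  + \sum_{r \ge 2} 4 \int_{S^r} (\kappa_r - \kappa_r^M).$$

Since $(\kappa_r)$ is integrable, given any $\varepsilon > 0$ we can make these
expected differences less than $\varepsilon$ by choosing $M$ large enough, and
the result follows from the bounded case. □

With the preparation above we are now ready to prove Theorem 1.5.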

Proof of Theorem 1.5. We assume without loss of generality that $S = [0,1]$,
with $\mu$ Lebesgue measure.

Let $\kappa' = (\kappa'_F)_{F \in \mathcal{F}}$ be an irreducible, integrable
kernel family, let $\kappa = (\kappa_r)_{r \ge 2}$ be the corresponding
hyperkernel (given by the formula introduced earlier), and let
$\varepsilon > 0$. As noted after Lemma 4.1, in the light of this lemma, it
suffices to prove the lower bound (23) on $C_1(G(n, \kappa))$. We may and shall
assume that $\rho(\kappa) > 0$ and $\varepsilon < \rho(\kappa)/10$, say.

Let $\delta > 0$ and $\omega = \omega(n)$ be as in Lemma 4.1, and let $H_n$,
$H_n'$ and $H_n''$ be the Poisson multi-hypergraphs associated to the
hyperkernels $\kappa$, $(1 - \delta)\kappa$ and $\delta\kappa$, respectively.
Using the same vertex types $\mathbf{x} = (x_1, \ldots, x_n)$ for all three
hypergraphs, there is a natural coupling in which $H_n = H_n' \cup H_n''$, with
$H_n'$ and $H_n''$ conditionally independent given $\mathbf{x}$.

Define $A$ and $\tau$ by (24) and (25), respectively, starting from the
integrable hyperkernel $\delta\kappa$. Note that $\tau$ is a kernel on $[0,1]$,
while $A$ is an $n$-by-$n$ matrix that depends on $\mathbf{x}$. Recall from (26)
that $\tau(x, y) = 0$ if and only if $\kappa_e(x, y) = 0$, so $\tau$ is
irreducible. In order to be able to apply Lemma 4.2, we would like to work with
a bounded kernel and matrices that are bounded uniformly in $n$. We achieve this
simply by considering $\tilde{A} = (\tilde{a}_{ij})$ and $\tilde{\tau}$ defined
by

$$\tilde{a}_{ij} = \min\{a_{ij}, 1\} \quad \text{and} \quad \tilde{\tau}(x, y) = \min\{\tau(x, y), 1\}.$$

Let $B$ be the (random) 'sampled' matrix corresponding to $\tau$, defined by

$$b_{ij} = \tau(x_i, x_j),$$

and let $\tilde{B}$ be the corresponding matrix associated to $\tilde{\tau}$.
The second statement of Lemma 4.4 tells us exactly that

$$\mathbb{E} \, \frac{1}{n^2} \sum_{i \ne j} |a_{ij} - b_{ij}| = o(1),$$

where the expectation is over the random choice of $\mathbf{x}$. Since
$|\tilde{a}_{ij} - \tilde{b}_{ij}| \le |a_{ij} - b_{ij}|$ for $i \ne j$, while
$|\tilde{a}_{ii} - \tilde{b}_{ii}| \le 1$, it follows that

$$\mathbb{E} \, \frac{1}{n^2} \sum_{i,j} |\tilde{a}_{ij} - \tilde{b}_{ij}| = o(1),$$

or, equivalently, that
$\mathbb{E} \|\kappa_{\tilde{A}} - \kappa_{\tilde{B}}\|_1 = o(1)$, where we write
$\kappa_M$ for the piecewise constant kernel
$\kappa_M : [0,1]^2 \to \mathbb{R}$ associated to a matrix $M$.

Since
$\delta_\square(\kappa_1, \kappa_2) \le \|\kappa_1 - \kappa_2\|_\square \le \|\kappa_1 - \kappa_2\|_1$,
it follows that
$\mathbb{E} \, \delta_\square(\kappa_{\tilde{A}}, \kappa_{\tilde{B}}) \to 0$, and
hence that
$\delta_\square(\kappa_{\tilde{A}}, \kappa_{\tilde{B}}) \overset{p}{\to} 0$.
Coupling the random sequences $\mathbf{x}$ for different $n$ appropriately, we
may and shall assume that

$$\delta_\square(\kappa_{\tilde{A}}, \kappa_{\tilde{B}}) \to 0 \tag{29}$$

almost surely.

Since $\bar\tau$ is a bounded kernel on $[0,1]$, i.e., a 'graphon' in the terminology
of [H], Theorem 4.7 of Borgs, Chayes, Lovász, Sós and Vesztergombi [H] tells
us that with probability at least $1 - e^{-n^2/(2\log_2 n)} = 1 - o(1)$, we have
$\delta_\square(\kappa_{\bar B},\bar\tau) \le 10 \sup\bar\tau/\sqrt{\log_2 n} = o(1)$. It follows that $\delta_\square(\kappa_{\bar B},\bar\tau) \to 0$ both in probability and
almost surely. Using (29), we see that
\[
  \delta_\square(\kappa_{\bar A},\bar\tau) \to 0 \tag{30}
\]
almost surely. Note that $\kappa_{\bar A}$ depends on the sequences $\mathbf x$.

Let $G'_n$ and $\tilde G_n$ be the simple graphs underlying $H'_n$ and $\tilde H_n$. From Lemma 4.1,
(21) holds whp. For the rest of the proof, we condition on $\mathbf x$ and on $H'_n$.
We assume, as we may, that (21) holds for all large enough $n$, and that (30)
holds. It suffices to show that with conditional probability $1 - o(1)$ we have
$C_1(G_n) \ge (\rho(\kappa) - 2\varepsilon)n$. Recall that, given $\mathbf x$, the (multi-)hypergraphs $H'_n$ and
$\tilde H_n$ are independent, so after our conditioning (on $\mathbf x$ and $H'_n$), the hypergraph
$\tilde H_n$ is formed by selecting each $r$-tuple $v_1,\dots,v_r$ to be an edge independently,
with probability $\delta\kappa_r(x_{v_1},\dots,x_{v_r})/n^{r-1}$.

Let $\hat G_n = \hat G(n,\delta\kappa,\mathbf x)$ be the random (multi-)graph defined from $\tilde H_n$ by
taking one edge from each hyperedge as in Definition 4.3, noting that $G'_n \cup \hat G_n \subseteq G_n$.
Since we have conditioned on $\mathbf x$ (and $G'_n$), as noted after Definition 4.3,
each possible edge $ij$ is present in $\hat G_n$ independently. In the multi-graph version,
the number of copies of $ij$ is Poisson with mean $a_{ij}/n$. Passing to a subgraph,
we shall take instead the number of copies to be Poisson with mean $\bar a_{ij}/n$. Since
this mean is $O(1/n)$, the probability that one or more copies of $ij$ is present is
$a'_{ij}/n$, where $a'_{ij} = \bar a_{ij} + O(1/n)$. Since $\delta_\square(\kappa_{A'},\kappa_{\bar A}) = O(1/n) = o(1)$, we have
$\delta_\square(\kappa_{A'},\bar\tau) \to 0$. Since $\bar\tau$ is an irreducible bounded kernel, the (simple graphs
underlying) $\hat G_n$ satisfy the assumptions of Lemma 4.2, so there is a constant
$c > 0$ such that for any two sets $V_n$, $V'_n$ of at least $\varepsilon n/2$ vertices of $\hat G_n$, the
probability that $V_n$ and $V'_n$ are not joined by a path in $\hat G_n$ is at most $e^{-cn}$.

Recall that we have conditioned on $G'_n$, assuming (21). Suppose also that
$C_1(G_n) < (\rho(\kappa) - 2\varepsilon)n$. Then there is a partition $(V_1,V_2)$ of the set of vertices
of $G'_n$ in large components in $G'_n$ with $|V_1|, |V_2| \ge \varepsilon n$ such that there is no path
in $G_n$ from $V_1$ to $V_2$. Let us call such a partition $(V_1,V_2)$ a bad partition. Having
conditioned on $G'_n$, noting that in any potential bad partition $V_1$ must be a union
of large components of $G'_n$, the number of possible choices for $(V_1,V_2)$ is at most
$2^{n/\omega} = e^{o(n)}$. On the other hand, since $\hat G_n \subseteq G_n$, the probability that any given
partition is bad is at most $e^{-cn}$, so the expected number of bad partitions is
$o(1)$, and whp there is no bad partition. Thus $C_1(G_n) \ge (\rho(\kappa) - 2\varepsilon)n$ holds
whp, as required. □



Remark 4.5. The restriction to irreducible kernel families in Theorem 1.5 is
of course necessary; roughly speaking, if $\kappa$ is reducible, then our graph $G(n,\kappa)$
falls into two or more parts. Lemma 4.1 still applies to show that we have
$\rho(\kappa)n + o_p(n)$ vertices in large components, but it may be that two or more
parts have giant components, each of smaller order than $\rho(\kappa)n$.

More precisely, let $\kappa$ be a reducible, integrable kernel family. Thus the edge
kernel $\kappa_e$ is reducible. By Lemma 5.17 of [10], there is a partition $S = \bigcup_{i=0}^{N} S_i$,
$N \le \infty$, of our ground space $S$ (usually $[0,1]$) such that each $S_i$ is measurable,
the restriction of $\kappa_e$ to $S_i$ is irreducible (in the natural sense), and, apart from
a measure zero set, $\kappa_e$ is zero off $\bigcup_{i=1}^{N} S_i \times S_i$.

Suppressing the dependence on $n$, let $G_i$ be the subgraph of $G(n,\kappa)$ induced
by the vertices with types in $S_i$. Since the vertex types are i.i.d., the probability
that $G(n,\kappa)$ contains any edges other than those of $\bigcup_{i\ge 1} G_i$ is 0. Now $G_i$ has a
random number $n_i$ of vertices, with a binomial $\mathrm{Bi}(n,\mu(S_i))$ distribution, which
is concentrated around its mean. Given $n_i$, the graph $G_i$ is another instance of
our model.

Let $\alpha_i = \int_{S_i} \rho_\kappa(x)\,d\mu(x)$, so that $\sum_i \alpha_i = \rho(\kappa) \le 1$. From the remarks above
it is easy to check that Theorem 1.5 gives $C_1(G_i)/n \overset{p}{\to} \alpha_i$ and $C_2(G_i) = o_p(n)$;
we omit the details. Sorting the $\alpha_i$ into decreasing order $\hat\alpha_1 \ge \hat\alpha_2 \ge \dots$, it follows
that $C_i(G(n,\kappa)) = \hat\alpha_i n + o_p(n)$ for each fixed (finite) $1 \le i \le N$, in particular,
for $i = 1$ and $i = 2$.



5 Disconnected atoms and percolation 

One of the most studied features of the various inhomogeneous network models is
their 'robustness' under random failures, and in particular, the critical point for
site or bond percolation on these random graphs. For example, this property of
the Barabási-Albert [5] model was studied experimentally by Barabási, Albert
and Jeong [2], heuristically by Callaway, Newman, Strogatz and Watts ^T\
(see also [T]) and Cohen, Erez, ben-Avraham and Havlin [55], and rigorously
in [14, 36]. In the present context, given $0 < p < 1$, we would like to study
the random subgraphs $G^{\langle p\rangle}(n,\kappa)$ and $G^{[p]}(n,\kappa)$ of $G(n,\kappa)$ obtained by deleting
edges or vertices respectively, keeping each edge or vertex with probability $p$,
independently of the others. In the edge-only model of [10], these graphs were
essentially equivalent to other instances of the same model: roughly speaking,
$G^{\langle p\rangle}(n,\kappa) \cong G(n,p\kappa)$ and $G^{[p]}(n,\kappa) \cong G(pn,p\kappa)$. (For precise statements,
see [10, Section 4].)

Here, the situation is a little more complex. When we delete edges randomly
from $G(n,\kappa)$, it may be that what is left of a particular atom $F$ is disconnected.
This forces us to consider generalized kernel families $(\kappa_F)_{F\in\mathcal G}$ with one kernel
$\kappa_F$ for each $F \in \mathcal G$, where the set $\mathcal G$ consists of one representative of each
isomorphism class of finite (not necessarily connected) graphs.

Rather than present a formal statement, let us consider a particular example.
Suppose that $\kappa$ is the generalized kernel family with only one kernel $\kappa_F$,
corresponding to the disjoint union $F$ of $K_3$ and $K_2$. Let $\kappa'$ be the kernel family
with two kernels,
\[
  \kappa'_3(x,y,z) = \int_{S^2} \kappa_F(x,y,z,u,v)\,d\mu(u)\,d\mu(v),
\]
corresponding to $K_3$, and
\[
  \kappa'_2(u,v) = \int_{S^3} \kappa_F(x,y,z,u,v)\,d\mu(x)\,d\mu(y)\,d\mu(z)
\]
for $K_2$. Then $G(n,\kappa)$ and $G(n,\kappa')$ are clearly very similar; the main differences
are that $G(n,\kappa)$ contains exactly the same number of added triangles and $K_2$s,
whereas in $G(n,\kappa')$ the numbers are only asymptotically equal, and that in
$G(n,\kappa)$ a triangle and a $K_2$ added in one step are necessarily disjoint. Since
almost all pairs of triangles and $K_2$s in $G(n,\kappa')$ are disjoint anyway, it is not
hard to check that $G(n,\kappa)$ and $G(n,\kappa')$ are 'locally equivalent', in that the
neighbourhoods of a random vertex in the two graphs can be coupled to agree
up to a fixed size whp.

More generally, given a generalized kernel family $\kappa = (\kappa_F)_{F\in\mathcal G}$, let $\kappa'$ be
the kernel family obtained by replacing each kernel $\kappa_F$ by one kernel for each
component $F'$ of $F$, obtained by integrating over variables corresponding to
vertices of $F \setminus F'$ as above. This may produce several new kernels for a given
connected $F'$; we of course simply add these together to produce a single kernel
$\kappa'_{F'}$. Note that
\[
  \sum_{F'\in\mathcal F} |F'| \int_{S^{|F'|}} \kappa'_{F'} = \sum_{F\in\mathcal G} |F| \int_{S^{|F|}} \kappa_F,
\]
so if $\kappa$ is integrable, then so is $\kappa'$. Although $G(n,\kappa)$ and $G(n,\kappa')$ are not
exactly equivalent, the truncation and local approximation arguments used to
prove Theorem 1.5 carry over easily to give the following result.

Theorem 5.1. Let $\kappa = (\kappa_F)_{F\in\mathcal G}$ be a generalized kernel family, let $\kappa'$ be the
corresponding kernel family as defined above, and let $\kappa'' = \widetilde{\kappa'}$ be the hyperkernel
corresponding to $\kappa'$, defined by ([5]). If $\kappa'$ is irreducible, then
\[
  C_1(G(n,\kappa)) = \rho(\kappa'')n + o_p(n),
\]
and $C_2(G(n,\kappa)) = o_p(n)$. □

Note that the hyperkernel $\kappa''$ corresponding to $\kappa'$ is obtained by replacing
each (now connected, as before) atom $F'$ by a clique; this corresponds to
replacing each component of an atom $F$ in $G(n,\kappa)$ by a clique.

Turning to bond percolation on $G(n,\kappa)$, i.e., to the study of the random subgraph
$G^{\langle p\rangle}(n,\kappa)$ of $G(n,\kappa)$, let $\kappa^{\langle p\rangle}$ be the kernel family obtained by replacing
each kernel $\kappa_F$ by $2^{e(F)}$ kernels $\kappa_{F'} = p^{e(F')}(1-p)^{e(F)-e(F')}\kappa_F$, one for each
spanning subgraph $F'$ of $F$. (As before, we then combine kernels corresponding to
isomorphic graphs $F'$.) Working with the Poisson multigraph formulation
of our model, the graphs $G^{\langle p\rangle}(n,\kappa)$ and $G(n,\kappa^{\langle p\rangle})$ have exactly the same distribution.
This observation and Theorem 5.1 allow us (in principle, at least) to
decide whether $G^{\langle p\rangle}(n,\kappa)$ has a giant component, i.e., to find the critical point
for bond percolation on $G(n,\kappa)$.

Let us illustrate this with the very simple special case in which each kernel
$\kappa_F$, $F \in \mathcal G$, is constant, say $\kappa_F = c_F$. We assume that $\kappa$ is integrable, i.e., that
$\sum_F |F| c_F < \infty$. In this case each kernel $\kappa_{F'}$ making up $\kappa^{\langle p\rangle}$ is also constant,
and the same applies to the hyperkernel $\kappa''$ corresponding to $\kappa^{\langle p\rangle}$. Hence, from
the remarks above and (fT4|), $G^{\langle p\rangle}(n,\kappa)$ has a giant component if and only if the
asymptotic edge density $\xi(\kappa'')$ of the hyperkernel $\kappa''$ is at least 1/2. Since we
obtain $\kappa''$ by first taking random subgraphs of our original atoms $F$, and then
replacing each component by a clique, we see that
\[
  \xi(\kappa'') = \sum_{F\in\mathcal F} c_F\,\theta_F(p),
\]
where $\theta_F(p)$ is the expected number of unordered pairs of distinct vertices of $F$
that lie in the same component of the random subgraph $F^{\langle p\rangle}$ of $F$ obtained by
keeping each edge with probability $p$, independently of the others. Alternatively,
\[
  2\xi(\kappa'') = \sum_{F\in\mathcal F} c_F\,|F|\,\bigl(\chi(F^{\langle p\rangle}) - 1\bigr),
\]
where $\chi(F^{\langle p\rangle})$ is the susceptibility of $F^{\langle p\rangle}$, i.e., the expected size of the component
of a random vertex of $F^{\langle p\rangle}$. If we have only a finite number of non-zero $c_F$,
then $\xi(\kappa'')$ may be evaluated as a polynomial in $p$, and the critical point found
exactly.
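As an illustration of the constant-kernel case, the following Python sketch evaluates $\theta_F(p)$ by brute-force enumeration of the $2^{e(F)}$ spanning subgraphs of $F$, forms $\xi(\kappa'') = \sum_F c_F\theta_F(p)$, and locates by bisection the value of $p$ at which the edge density equals 1/2. The function names and the encoding of an atom as a triple `(c_F, number of vertices, edge list)` are assumptions made for this example only; the enumeration is exponential in $e(F)$ and is meant for small atoms.

```python
from itertools import combinations


def component_labels(n, edges):
    """Label the components of a graph on vertices 0..n-1 via union-find."""
    parent = list(range(n))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    for u, v in edges:
        parent[find(u)] = find(v)
    return [find(v) for v in range(n)]


def theta(n, edges, p):
    """theta_F(p): expected number of unordered vertex pairs of F lying in the
    same component when each edge is kept independently with probability p."""
    m = len(edges)
    total = 0.0
    for mask in range(1 << m):  # every spanning subgraph F' of F
        kept = [edges[i] for i in range(m) if (mask >> i) & 1]
        weight = p ** len(kept) * (1 - p) ** (m - len(kept))
        lab = component_labels(n, kept)
        pairs = sum(lab[a] == lab[b] for a, b in combinations(range(n), 2))
        total += weight * pairs
    return total


def edge_density(atoms, p):
    """xi(kappa'') = sum_F c_F * theta_F(p); atoms is a list of (c_F, n, edges)."""
    return sum(c * theta(n, edges, p) for c, n, edges in atoms)


def critical_p(atoms, tol=1e-10):
    """Bisection for the p at which xi(kappa'') = 1/2 (xi is nondecreasing in p).
    Returns None if even p = 1 does not reach edge density 1/2."""
    if edge_density(atoms, 1.0) < 0.5:
        return None
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if edge_density(atoms, mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return hi


K2 = (2, [(0, 1)])
K3 = (3, [(0, 1), (0, 2), (1, 2)])
```

For a single edge, $\theta_{K_2}(p) = p$, so with $c_{K_2} = c$ the sketch returns $1/(2c)$; for a triangle, $\theta_{K_3}(p) = 3(p + p^2 - p^3)$, a cubic whose root is found numerically.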

Turning to site percolation, there is a similar reduction to another instance
of our model, most easily described by modifying the type space. Indeed, we add
a new type $\star$ corresponding to deleted vertices, and set $\mu'(\{\star\}) = 1 - p$. Setting
$\mu'(A) = p\mu(A)$ for $A \subseteq S$, we obtain a probability measure $\mu'$ on $S' = S \cup \{\star\}$.
Replacing each kernel $\kappa_F$ by $2^{|F|}$ kernels $\kappa'_{F'}$ on $S'$ defined appropriately (with
$F'$ corresponding to the subgraph of $F$ spanned by the non-deleted vertices),
one can show that $G^{[p]}(n,\kappa)$ is very close to (in the Poisson version, identical
to) a suitable instance $G(n',\kappa')$ of our model, where $n'$ is now random but
concentrated around its mean $pn$. In the first instance $\kappa'$ may include kernels
for disconnected graphs, but as above we can find an asymptotically equivalent
kernel family involving only connected graphs. In this way one can find the
asymptotic size of any giant component in $G^{[p]}(n,\kappa)$; we omit the mathematically
straightforward but notationally complex details.

6 Vertex degrees 

Heuristically, the vertex degrees in $G(n,\kappa)$ can be described as follows.
Consider a vertex $v$ and condition on its type $x_v$. The number of atoms that
contain $v$ then is asymptotically Poisson with a certain mean depending on $\kappa$
and $x_v$. However, each atom may add several edges to the vertex $v$, and thus the
asymptotic distribution of the vertex degree is compound Poisson (see below for
a definition). Moreover, this compound Poisson distribution typically depends
on the type $x_v$, so the final result is that, asymptotically, the vertex degrees
have a mixed compound Poisson distribution. In this section we shall make this
precise and rigorous.

We begin with some definitions. If $\lambda$ is a finite measure on $\mathbb N$, then $\mathrm{CPo}(\lambda)$,
the compound Poisson distribution with intensity $\lambda$, is defined as the distribution
of $\sum_{j=1}^{\infty} jX_j$, where $X_j \sim \mathrm{Po}(\lambda\{j\})$ are independent Poisson random
variables. Equivalently, $\mathrm{CPo}(\lambda)$ is the distribution of the sum $\sum_\nu \xi_\nu$ of the
points of a Poisson process $\{\xi_\nu\}$ on $\mathbb N$ with intensity $\lambda$, regarded as a multiset.
(The latter definition generalizes to arbitrary measures $\lambda$ on $(0,\infty)$ such
that $\int_0^\infty (t \wedge 1)\,d\lambda(t) < \infty$, but we consider in this paper only the integer case.)
Since $X_j$ has probability generating function $\mathbb E z^{X_j} = e^{\lambda\{j\}(z-1)}$, $\mathrm{CPo}(\lambda)$ has
probability generating function
\[
  \varphi_{\mathrm{CPo}(\lambda)}(z) = \mathbb E z^{\sum_j jX_j} = \prod_{j=1}^{\infty} \mathbb E z^{jX_j}
  = \prod_{j=1}^{\infty} e^{\lambda\{j\}(z^j-1)} = e^{\sum_{j=1}^{\infty} \lambda\{j\}(z^j-1)},
\]
whenever this is defined, which it certainly is for $|z| \le 1$.
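The generating function above also gives a convenient way to compute the point probabilities of $\mathrm{CPo}(\lambda)$ when $\lambda$ has finite support: differentiating $\varphi$ yields $n\,p_n = \sum_j j\lambda\{j\}\,p_{n-j}$ with $p_0 = e^{-\lambda(\mathbb N)}$. The following Python sketch implements this recursion; the function names and the representation of $\lambda$ as a dictionary `{j: lambda{j}}` are assumptions of the example, and the recursion is a standard device rather than something taken from the text.

```python
import math


def cpo_pmf(lam, nmax):
    """Point probabilities p_0..p_nmax of CPo(lam), where lam maps each
    integer j >= 1 to the mass lam{j}. Uses n*p_n = sum_j j*lam{j}*p_{n-j}."""
    pmf = [math.exp(-sum(lam.values()))]  # P(no points at all)
    for n in range(1, nmax + 1):
        s = sum(j * w * pmf[n - j] for j, w in lam.items() if j <= n)
        pmf.append(s / n)
    return pmf


def cpo_pgf(lam, z):
    """Closed form exp(sum_j lam{j} (z^j - 1)) of the generating function."""
    return math.exp(sum(w * (z ** j - 1) for j, w in lam.items()))
```

With `lam = {1: a}` the recursion reproduces the Poisson distribution $\mathrm{Po}(a)$, and with `lam = {2: a}` it gives the distribution of $2X$, $X \sim \mathrm{Po}(a)$, which is supported on the even integers.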

If $\Lambda$ is a random finite measure on $\mathbb N$, then $\mathrm{MCPo}(\Lambda)$ denotes the corresponding
mixed compound Poisson distribution. From now on, for each $x \in S$,
$\lambda_x$ will be a finite measure on $\mathbb N$, depending measurably on $x$. We shall write
$\Lambda$ for the corresponding random measure on $\mathbb N$, obtained by choosing $x$ from $S$
according to the distribution $\mu$ and then taking $\lambda_x$. Thus $\mathrm{MCPo}(\Lambda)$ is defined
by the point probabilities
\[
  \mathrm{MCPo}(\Lambda)\{i\} = \int_S \mathrm{CPo}(\lambda_x)\{i\}\,d\mu(x)
\]
or, equivalently, the probability generating function
\[
  \varphi_{\mathrm{MCPo}(\Lambda)}(z) = \int_S \varphi_{\mathrm{CPo}(\lambda_x)}(z)\,d\mu(x)
  = \int_S e^{\sum_j \lambda_x\{j\}(z^j-1)}\,d\mu(x).
\]

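Remark 6.1. Since we have assumed that $\lambda$ is a finite measure, $\mathbb E\sum_j X_j =
\lambda(\mathbb N) < \infty$; thus a.s. $\sum_j X_j < \infty$ and only finitely many $X_j$ are non-zero, whence
$\sum_j jX_j < \infty$ a.s. This verifies that $\mathrm{CPo}(\lambda)$ is a proper probability distribution.
On the other hand, the mean of $\mathrm{CPo}(\lambda)$ is
\[
  \mathbb E\,\mathrm{CPo}(\lambda) = \sum_j j\,\mathbb E X_j = \sum_j j\lambda\{j\} = \int_0^\infty t\,d\lambda(t),
\]
which may be infinite. As a consequence,
\[
  \mathbb E\,\mathrm{MCPo}(\Lambda) = \int_S \int_0^\infty t\,d\lambda_x(t)\,d\mu(x) \le \infty. \tag{31}
\]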

Let $d_{TV}$ denote the total variation distance between two random variables,
or rather their probability distributions, defined by
\[
  d_{TV}(X,Y) = \sup_A |\mathbb P(X \in A) - \mathbb P(Y \in A)|, \tag{32}
\]
where the supremum is taken over all measurable sets $A \subseteq \mathbb R$. We shall use
the following trivial upper bound on the total variation distance between two
compound Poisson distributions.

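Lemma 6.2. If $\lambda$ and $\lambda'$ are two finite measures on $\mathbb N$, then
\[
  d_{TV}(\mathrm{CPo}(\lambda),\mathrm{CPo}(\lambda')) \le \|\lambda - \lambda'\| := \sum_j |\lambda\{j\} - \lambda'\{j\}|.
\]
Proof. Let $X_j \sim \mathrm{Po}(\lambda\{j\})$ be as above and let $X'_j \sim \mathrm{Po}(\lambda'\{j\})$ be another
family of independent Poisson variables. We can easily couple the families so
that $\mathbb P(X_j \ne X'_j) \le |\lambda\{j\} - \lambda'\{j\}|$ for every $j$.
Then
\[
  d_{TV}(\mathrm{CPo}(\lambda),\mathrm{CPo}(\lambda')) \le \mathbb P\Bigl(\sum_j jX_j \ne \sum_j jX'_j\Bigr)
  \le \sum_j \mathbb P(X_j \ne X'_j) \le \sum_j |\lambda\{j\} - \lambda'\{j\}|. \qquad\square
\]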
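The bound in Lemma 6.2 is easy to check numerically for intensities with finite support. In the Python sketch below, the total variation distance is computed as half the $\ell_1$ distance between the two point-probability vectors, truncated at `nmax` (so it slightly underestimates the true distance), and compared with $\|\lambda - \lambda'\|$. The helper names and the dictionary encoding of the measures are assumptions of the example; the point probabilities come from the standard recursion $n\,p_n = \sum_j j\lambda\{j\}p_{n-j}$.

```python
import math


def cpo_pmf(lam, nmax):
    """Point probabilities p_0..p_nmax of CPo(lam); lam maps j >= 1 to lam{j}."""
    pmf = [math.exp(-sum(lam.values()))]
    for n in range(1, nmax + 1):
        s = sum(j * w * pmf[n - j] for j, w in lam.items() if j <= n)
        pmf.append(s / n)
    return pmf


def tv_distance(lam1, lam2, nmax=200):
    """Half the l1 distance between the pmfs on 0..nmax: a lower bound on
    d_TV that is accurate once the tails beyond nmax are negligible."""
    p, q = cpo_pmf(lam1, nmax), cpo_pmf(lam2, nmax)
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def measure_distance(lam1, lam2):
    """||lam1 - lam2|| = sum_j |lam1{j} - lam2{j}|."""
    keys = set(lam1) | set(lam2)
    return sum(abs(lam1.get(j, 0.0) - lam2.get(j, 0.0)) for j in keys)
```

For example, `tv_distance({1: 1.0, 2: 0.5}, {1: 1.2, 2: 0.4, 3: 0.1})` can be compared with `measure_distance` of the same arguments, which equals 0.4.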

Given an integrable kernel family $\kappa$ and $x \in S$, $F \in \mathcal F$ and $j \in V(F) = [|F|]$,
let
\[
  \lambda_{F,j}(x) = \int_{S^{|F|-1}} \kappa_F(x_1,\dots,x_{j-1},x,x_{j+1},\dots,x_{|F|})\,
  d\mu(x_1)\cdots d\mu(x_{j-1})\,d\mu(x_{j+1})\cdots d\mu(x_{|F|}) \tag{33}
\]
be the (asymptotic) expected number of added copies of $F$ containing a given
vertex of type $x$ in which the given vertex corresponds to vertex $j$ in $F$. Let
$d_F(j)$ be the degree of vertex $j$ in $F$, and define the measure
\[
  \lambda_x := \sum_{F\in\mathcal F} \sum_{j\in V(F)} \lambda_{F,j}(x)\,\delta_{d_F(j)}, \tag{34}
\]
where, as usual, $\delta_d$ denotes the probability measure assigning mass 1 to $d$. Thus
$\lambda_x$ is a measure on $\mathbb N$, with point masses given by
\[
  \lambda_x\{d\} = \sum_{F\in\mathcal F} \sum_{j:\,d_F(j)=d} \lambda_{F,j}(x), \tag{35}
\]
the (asymptotic) expected number of atoms containing a given vertex of type $x$
and having degree $d$ there. From (33), $\int_S \lambda_{F,j}(x)\,d\mu(x) = \int_{S^{|F|}} \kappa_F$, and thus, since
$\kappa$ is integrable,
\[
  \int_S \|\lambda_x\|\,d\mu(x) = \sum_{F,j} \int_S \lambda_{F,j}(x)\,d\mu(x) = \sum_F |F| \int_{S^{|F|}} \kappa_F < \infty.
\]
Consequently, $\lambda_x$ is a finite measure on $\mathbb N$ for a.e. $x$, and the mixed compound
Poisson distribution $\mathrm{MCPo}(\Lambda)$ is defined.

Let the random variable $D = D_n$ be the degree of any fixed vertex in $G(n,\kappa)$.
Equivalently, by symmetry, we can take $D_n$ to be the degree of a uniformly
random vertex. Furthermore, for $\ell \ge 0$, let $n_\ell$ be the number of vertices with
degree $\ell$ in $G(n,\kappa)$. Then the random sequence $(n_\ell/n)_{\ell\ge 0}$ can be regarded
as a (random) probability distribution, viz., the conditional distribution of the
degree of a random vertex in $G(n,\kappa)$, given this random graph. Note that
$\mathbb P(D_n = \ell) = \mathbb E\,n_\ell/n$.

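Theorem 6.3. Suppose that $\kappa = (\kappa_F)_{F\in\mathcal F}$ is an integrable kernel family. Then,
as $n \to \infty$,

(i) $D_n \overset{d}{\to} \mathrm{MCPo}(\Lambda)$, and

(ii) $\mathbb E D_n \to \mathbb E\,\mathrm{MCPo}(\Lambda) = \sum_{F\in\mathcal F} 2e(F) \int_{S^{|F|}} \kappa_F = 2\xi(\kappa) \le \infty$.

(iii) Moreover, for every fixed $\ell$,
\[
  n_\ell = \mathrm{MCPo}(\Lambda)\{\ell\}\,n + o_p(n) \tag{36}
\]
and thus $(n_\ell/n)_{\ell\ge 0} \overset{p}{\to} \mathrm{MCPo}(\Lambda)$ in the space of probability measures on
$\mathbb N$.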

Note that the limit distribution exists for every integrable kernel family, but
has finite expectation only if the kernel family is edge integrable.

As usual, Theorem 6.3 applies to the variants of the model $G(n,\kappa)$ discussed
in Section [l]. In the proof, we shall mostly work with the (non-Poisson) multi-graph
form, where we add at most one copy of a certain small graph $F$ with a
particular vertex set, but keep any resulting multiple edges.
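To make the statement of Theorem 6.3 concrete, here is a small Python simulation sketch for one special case: only triangles are added, with a constant kernel $\kappa_3 = c$. It uses the Poisson multigraph form (an assumption of the example, chosen because it is cheap to sample): each unordered triple receives a Poisson number of triangles with mean $6c/n^2$, so the total number of atoms is Poisson with mean $\binom{n}{3}\,6c/n^2$ and the atoms sit on uniformly random triples. In this case $\lambda_x = 3c\,\delta_2$, so the limiting degree distribution is that of $2X$ with $X \sim \mathrm{Po}(3c)$. All function names are hypothetical.

```python
import math
import random


def poisson(rng, mean):
    """Sample Po(mean) by counting unit-rate exponential arrivals up to time mean."""
    count, t = 0, rng.expovariate(1.0)
    while t <= mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def triangle_model_degrees(n, c, rng):
    """Multigraph degrees when only triangles are added, with constant kernel c.

    Poisson version: 3! labelled copies per triple, each with intensity c/n^2,
    so the number of triangle atoms is Po(C(n,3) * 6c / n^2)."""
    mean_atoms = math.comb(n, 3) * 6.0 * c / n ** 2
    deg = [0] * n
    for _ in range(poisson(rng, mean_atoms)):
        for v in rng.sample(range(n), 3):
            deg[v] += 2  # a triangle adds two edges at each of its vertices
    return deg


def limit_pmf(c, ell):
    """MCPo(Lambda){ell} in this special case: P(2X = ell) with X ~ Po(3c)."""
    if ell % 2:
        return 0.0
    k = ell // 2
    return math.exp(-3 * c) * (3 * c) ** k / math.factorial(k)
```

Comparing `deg.count(l) / n` with `limit_pmf(c, l)` for moderate `n` gives an empirical view of part (iii); the mean degree should be close to $6c = 2\xi(\kappa)$, in line with part (ii).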
Proof. Assume first that $\kappa$ is a bounded kernel family, with $\kappa_F \le M$ and $\kappa_F = 0$
if $|F| > M$. Fix a vertex $v \in [n]$, and let $D$ be the degree of $v$. For $F \in \mathcal F$
with $|F| \le M$ and $j \in V(F)$, let $N_{F,j}$ be the number of added copies of $F$ that
contain $v$ with $v$ corresponding to vertex $j$ in $F$. Let
\[
  D' := \sum_{F,j} N_{F,j}\,d_F(j); \tag{37}
\]
this is the number of edges added to $v$, including possible repetitions. Thus
$D = D'$ unless two added edges with endpoint $v$ coincide. For any other vertex
$w$, conditioned on the types $\mathbf x = (x_1,\dots,x_n)$, the number of atoms containing
both $v$ and $w$ is a sum $\sum_\nu I_\nu$ of independent Bernoulli variables $I_\nu \sim \mathrm{Be}(p_\nu)$,
for $\nu$ in some index set. For each $r = 2,\dots,M$ there are $O(n^{r-2})$ such variables,
each with $p_\nu = O(n^{1-r})$. Hence,
\[
  \mathbb P\Bigl(\sum_\nu I_\nu \ge 2 \Bigm| \mathbf x\Bigr) \le \sum_{\nu_1 < \nu_2} p_{\nu_1}p_{\nu_2} \le \Bigl(\sum_\nu p_\nu\Bigr)^2 = O(n^{-2}).
\]
Since there are $n-1$ possible choices for $w$, it follows that
\[
  d_{TV}\bigl((D \mid \mathbf x), (D' \mid \mathbf x)\bigr) \le \mathbb P(D \ne D' \mid \mathbf x) = O(n^{-1}). \tag{38}
\]



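Hence, in proving (i) it makes no difference whether we work with $D'$ or with
$D$, i.e., with the multi-graph or simple graph version of $G(n,\kappa)$.

Conditioned on $\mathbf x$, $N_{F,j}$ is a sum of independent Bernoulli variables $\mathrm{Be}(p_{F,j,\alpha}(\mathbf x))$
for $\alpha$ in some index set $A_{F,j}$, with $p_{F,j,\alpha}(\mathbf x) = O(n^{1-|F|})$ given by (1) and
$|A_{F,j}| = O(n^{|F|-1})$.

Let $\lambda_{F,j}(\mathbf x) = \mathbb E(N_{F,j} \mid \mathbf x) = \sum_\alpha p_{F,j,\alpha}(\mathbf x)$. By a classical Poisson approximation
theorem (see [SI (1.8)]),
\[
  d_{TV}\bigl((N_{F,j} \mid \mathbf x), \mathrm{Po}(\lambda_{F,j}(\mathbf x))\bigr) \le \sum_\alpha p_{F,j,\alpha}(\mathbf x)^2 = O(n^{1-|F|}) = O(n^{-1}). \tag{39}
\]
(This follows easily from the elementary $d_{TV}(\mathrm{Be}(p),\mathrm{Po}(p)) \le p^2$; see e.g. [6,
page 4 and Theorem 2.M] for history and further results.) Furthermore, given
$\mathbf x$, the random variables $N_{F,j}$ are independent, and thus (37) and (39) imply
that if $X_{F,j} \sim \mathrm{Po}(\lambda_{F,j}(\mathbf x))$ are independent, then
\[
  d_{TV}\Bigl((D' \mid \mathbf x), \sum_{F,j} d_F(j)X_{F,j}\Bigr) = O(n^{-1}).
\]
Since $\sum_{F,j} d_F(j)X_{F,j}$ has a compound Poisson distribution $\mathrm{CPo}(\lambda(\mathbf x))$ with
intensity $\lambda(\mathbf x) := \sum_{F,j} \lambda_{F,j}(\mathbf x)\,\delta_{d_F(j)}$, we have
\[
  d_{TV}\bigl((D' \mid \mathbf x), \mathrm{CPo}(\lambda(\mathbf x))\bigr) = O(n^{-1}).
\]
By (38) and Lemma 6.2, this yields
\[
  d_{TV}\bigl((D \mid \mathbf x), \mathrm{CPo}(\lambda_{x_v})\bigr) \le O(n^{-1}) + \|\lambda(\mathbf x) - \lambda_{x_v}\|.
\]
In particular, for every $\ell \in \mathbb N$, taking $A = \{\ell\}$ in (32),
\[
  |\mathbb P(D = \ell \mid \mathbf x) - \mathrm{CPo}(\lambda_{x_v})\{\ell\}| \le O(n^{-1}) + \|\lambda(\mathbf x) - \lambda_{x_v}\|. \tag{40}
\]
Taking the expectation of both sides, and noting that $\mathbb E\,\mathbb P(D = \ell \mid \mathbf x) = \mathbb P(D = \ell)$
and $\mathbb E\,\mathrm{CPo}(\lambda_{x_v})\{\ell\} = \mathrm{MCPo}(\Lambda)\{\ell\}$, we find that
\[
  |\mathbb P(D = \ell) - \mathrm{MCPo}(\Lambda)\{\ell\}| \le O(n^{-1}) + \mathbb E\|\lambda(\mathbf x) - \lambda_{x_v}\|. \tag{41}
\]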

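We shall show that the final term is small.
By (1), with $r = |F|$,
\[
  \lambda_{F,j}(\mathbf x) = n^{-(r-1)} \sum \kappa_F(x_{v_1},\dots,x_{v_r}),
\]
where the sum runs over all $(n-1)_{(r-1)}$ sequences $v_1,\dots,v_r$ of distinct elements
in $[n]$ with $v_j = v$. Consequently, by (33),
\[
  \mathbb E(\lambda_{F,j}(\mathbf x) \mid x_v) = (1 - O(n^{-1}))\,\lambda_{F,j}(x_v). \tag{42}
\]
Recalling that $\kappa$ is bounded, it is easy to check (as in the similar argument in
the proof of Lemma 4.4) that
\[
  \mathrm{Var}(\lambda_{F,j}(\mathbf x) \mid x_v) = \mathbb E\bigl((\lambda_{F,j}(\mathbf x) - \mathbb E(\lambda_{F,j}(\mathbf x) \mid x_v))^2 \bigm| x_v\bigr) = O(n^{-1})
\]
and thus, by the Cauchy-Schwarz inequality and (42),
\[
  \mathbb E\bigl(|\lambda_{F,j}(\mathbf x) - \lambda_{F,j}(x_v)| \bigm| x_v\bigr) = O(n^{-1/2}).
\]
Consequently, using again that $\kappa$ is bounded,
\[
  \mathbb E\|\lambda(\mathbf x) - \lambda_{x_v}\| \le \mathbb E \sum_{F,j} |\lambda_{F,j}(\mathbf x) - \lambda_{F,j}(x_v)| = O(n^{-1/2}) = o(1), \tag{43}
\]
so
\[
  \|\lambda(\mathbf x) - \lambda_{x_v}\| \overset{p}{\to} 0.
\]
Combining (41) and (43) we see that $\mathbb P(D = \ell) \to \mathrm{MCPo}(\Lambda)\{\ell\}$ for every $\ell$, i.e.,
$D \overset{d}{\to} \mathrm{MCPo}(\Lambda)$, which proves (i) for bounded $\kappa$.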


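Next we turn to the proof of (iii), assuming still that $\kappa$ is bounded. Fix a
number $\ell \in \mathbb N$, and for $v \in [n]$ let $D_v$ be the degree of $v$ in $G(n,\kappa)$, and $I_v$ the
indicator $\mathbf 1[D_v = \ell]$.

Fix two distinct vertices $v$ and $w$, let $\mathcal G$ be the set of atoms that contain
both $v$ and $w$, and let $\tilde D_v$ and $\tilde D_w$ be the degrees of the vertices if we delete
(or ignore) the atoms in $\mathcal G$. Since $\kappa$ is bounded, the expected number $\mathbb E|\mathcal G|$ of such
exceptional atoms is $O(n^{-1})$, and thus
\[
  \mathbb P(D_v \ne \tilde D_v),\ \mathbb P(D_w \ne \tilde D_w) \le \mathbb P(\mathcal G \ne \emptyset) \le \mathbb E|\mathcal G| = O(n^{-1}).
\]
Moreover, these bounds hold conditional on $\mathbf x$. Furthermore, given $\mathbf x$, $\tilde D_v$ and
$\tilde D_w$ are independent. Consequently, for any $\ell \in \mathbb N$,
\begin{align*}
  \mathbb E(I_v I_w \mid \mathbf x) = \mathbb P(D_v = D_w = \ell \mid \mathbf x) &= \mathbb P(\tilde D_v = \tilde D_w = \ell \mid \mathbf x) + o(1) \\
  &= \mathbb P(\tilde D_v = \ell \mid \mathbf x)\,\mathbb P(\tilde D_w = \ell \mid \mathbf x) + o(1) \\
  &= \mathbb P(D_v = \ell \mid \mathbf x)\,\mathbb P(D_w = \ell \mid \mathbf x) + o(1) \\
  &= \mathbb E(I_v \mid \mathbf x)\,\mathbb E(I_w \mid \mathbf x) + o(1),
\end{align*}
and thus $\mathrm{Cov}(I_v, I_w \mid \mathbf x) = o(1)$. Since $n_\ell = \sum_v I_v$, it follows that $\mathrm{Var}(n_\ell \mid \mathbf x) =
o(n^2)$ and thus
\[
  n_\ell = \mathbb E(n_\ell \mid \mathbf x) + o_p(n). \tag{44}
\]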

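Further, if we write $h(x) = \mathrm{CPo}(\lambda_x)\{\ell\}$ and sum (40) (where $D = D_v$) over $v$,
we obtain
\[
  \Bigl|\mathbb E(n_\ell \mid \mathbf x) - \sum_{v=1}^{n} h(x_v)\Bigr| = \Bigl|\sum_{v=1}^{n} \bigl(\mathbb P(D_v = \ell \mid \mathbf x) - h(x_v)\bigr)\Bigr|
  \le O(1) + \sum_{v=1}^{n} \|\lambda(\mathbf x) - \lambda_{x_v}\|.
\]
By (43), the right-hand side has expectation $o(n)$ and thus
\[
  \mathbb E(n_\ell \mid \mathbf x) = \sum_{v=1}^{n} h(x_v) + o_p(n). \tag{45}
\]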

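Now $h(x_1),\dots,h(x_n)$ are i.i.d. random variables with mean
\[
  \mathbb E\,h(x_v) = \int_S h(x)\,d\mu(x) = \int_S \mathrm{CPo}(\lambda_x)\{\ell\}\,d\mu(x) = \mathrm{MCPo}(\Lambda)\{\ell\}.
\]
Hence, by the law of large numbers, $\frac1n \sum_{v=1}^{n} h(x_v) \overset{p}{\to} \mathrm{MCPo}(\Lambda)\{\ell\}$, which is
the same as
\[
  \sum_{v=1}^{n} h(x_v) = \mathrm{MCPo}(\Lambda)\{\ell\}\,n + o_p(n). \tag{46}
\]
The result (36) follows from (44), (45) and (46).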

Furthermore, (36) says that $(n_\ell/n)_\ell \overset{p}{\to} \mathrm{MCPo}(\Lambda)$ in the space $\mathbb R^\infty$ of sequences,
equipped with the product topology, which is the same as separate
convergence of the components. However, it is well-known, and easy to see
(e.g. by compactness) that restricted to the set of probability distributions, this
equals the standard topology there.



We have proved (i) and (iii) for bounded $\kappa$. For general $\kappa$, we use truncations:
define $\kappa^M$ in analogy with (fT^, setting $\kappa^M_F = \kappa_F \wedge M$ for $|F| \le M$ and $\kappa^M_F = 0$
for $|F| > M$. We use $\Lambda^M$, $n^M_\ell$ and so on to denote quantities defined for
$G(n,\kappa^M)$. For fixed $M$, applying (36) for the bounded kernel family $\kappa^M$, we
have $n^M_\ell/n \overset{p}{\to} \mathrm{MCPo}(\Lambda^M)\{\ell\}$ as $n \to \infty$, and thus by dominated convergence
\[
  \mathbb E|n^M_\ell/n - \mathrm{MCPo}(\Lambda^M)\{\ell\}| \to 0. \tag{47}
\]
Furthermore, for every $x \in S$ and $d \ge 1$, the intensities $\lambda^M_x\{d\}$ converge to $\lambda_x\{d\}$
as $M \to \infty$, by (35), (33) and monotone convergence. Thus a simple coupling
shows that $\mathrm{MCPo}(\Lambda^M) \overset{d}{\to} \mathrm{MCPo}(\Lambda)$ as $M \to \infty$. We may couple $G(n,\kappa)$ and
$G(n,\kappa^M)$ in the obvious way so that $G(n,\kappa)$ is obtained from $G(n,\kappa^M)$ by
adding further atoms, say $N^M_F$ copies of each $F \in \mathcal F$. Then $\mathbb E N^M_F \le n \int_{S^{|F|}} (\kappa_F - \kappa^M_F)$,
and since at most $\sum_F |F| N^M_F$ vertices are affected by the extra additions,
\[
  \mathbb E\Bigl|\frac{n_\ell}{n} - \frac{n^M_\ell}{n}\Bigr| \le \frac1n\,\mathbb E \sum_F |F| N^M_F \le \sum_F |F| \int_{S^{|F|}} (\kappa_F - \kappa^M_F). \tag{48}
\]
The right hand side is independent of $n$, and tends to 0 as $M \to \infty$ by dominated
convergence and our assumption that $\kappa$ is integrable. For any $\varepsilon > 0$, we may
thus choose $M$ so large that the right hand side of (48) is less than $\varepsilon$, and also
so that $|\mathrm{MCPo}(\Lambda^M)\{\ell\} - \mathrm{MCPo}(\Lambda)\{\ell\}| < \varepsilon$; then by (47), for large enough $n$,
\[
  \mathbb E|n_\ell/n - \mathrm{MCPo}(\Lambda)\{\ell\}| < 3\varepsilon,
\]
which proves (36) and thus (iii). Further, (36) and dominated convergence yield
$\mathbb P(D_n = \ell) = \mathbb E(n_\ell/n) \to \mathrm{MCPo}(\Lambda)\{\ell\}$, which proves (i).



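Finally we prove (ii). (This could also easily be done directly in a fairly
straightforward way.) First, (31) and (35) yield
\[
  \mathbb E\,\mathrm{MCPo}(\Lambda) = \int_S \sum_{F,j} \lambda_{F,j}(x)\,d_F(j)\,d\mu(x) = \sum_{F\in\mathcal F} \sum_{j\in V(F)} d_F(j) \int_{S^{|F|}} \kappa_F,
\]
which yields the formula for $\mathbb E\,\mathrm{MCPo}(\Lambda)$ claimed in the theorem, since $\sum_j d_F(j) =
2e(F)$.

Next, the convergence in distribution (i) yields (by a version of Fatou's
Lemma) the inequality $\liminf_{n\to\infty} \mathbb E D_n \ge \mathbb E\,\mathrm{MCPo}(\Lambda)$. Finally, recalling the
definition (37) of $D'_n$ (denoted $D'$ in (37)), we have $D_n \le D'_n$ and thus
\[
  \mathbb E D_n \le \mathbb E D'_n = \sum_{F,j} d_F(j)\,\mathbb E N_{F,j}
  = \sum_{F\in\mathcal F} \sum_{j\in V(F)} d_F(j)\,\frac{(n-1)_{(|F|-1)}}{n^{|F|-1}} \int_{S^{|F|}} \kappa_F
  \le \sum_{F\in\mathcal F} 2e(F) \int_{S^{|F|}} \kappa_F = \mathbb E\,\mathrm{MCPo}(\Lambda),
\]
yielding the opposite inequality $\limsup_{n\to\infty} \mathbb E D_n \le \mathbb E\,\mathrm{MCPo}(\Lambda)$. □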



Part (ii) of Theorem 6.3 is not surprising. Also, since by symmetry $\mathbb E D_n =
\frac{2}{n}\,\mathbb E\,e(G(n,\kappa))$, it follows from Theorem 1.3 (which we shall not prove until the
next section). For bounded kernel families, it is easy to see that also higher moments
of $D_n$ converge to the corresponding moments of $\mathrm{MCPo}(\Lambda)$, for example
by first showing that $\mathbb E D_n^m = O(1)$ for every fixed $m$ and combining this with (i).
This extends to certain unbounded kernel families, but somewhat surprisingly
not to all integrable kernel families, as the following example shows.

Example 6.4. Let $S = [0,1)$ with Lebesgue measure, and regard $S$ as a circle
with the usual metric $d(x,y) = \min(|x-y|, 1-|x-y|)$. We construct our
random graph by adding triangles only; thus $\kappa_F = 0$ for $F \ne K_3$, and we take
\[
  \kappa_3(x,y,z) = d(x,y)^{-1+\varepsilon} + d(x,z)^{-1+\varepsilon} + d(y,z)^{-1+\varepsilon} \tag{49}
\]
for some small $\varepsilon > 0$, for example $\varepsilon = 1/10$. Clearly, $\kappa$ is an integrable kernel
family (and a hyperkernel).

Let $\Delta = \min_{1\le i<j\le n} d(x_i,x_j)$ be the minimal spacing between the $n$ independent
uniformly distributed random points $x_i$, $1 \le i \le n$. It is well-known
that this minimal spacing is of order $n^{-2}$; in fact, it is easy to see that for $0 < s <
1/n$ we have $\mathbb P(\Delta > s) = (1 - sn)^{n-1} < e^{-sn(n-1)}$, and in particular $\Delta < n^{\varepsilon-2}$
whp. Hence there exist whp two distinct indices $i$ and $j$ with $d(x_i,x_j) < n^{\varepsilon-2}$,
and thus, for large $n$ and every $x_k$, $\kappa_3(x_i,x_j,x_k) > n^{(2-\varepsilon)(1-\varepsilon)} > 2n^{2-3\varepsilon}$. If $i$ and
$j$ are chosen such that this holds, then from (1) we have $p(i,j,k;\kappa_3) > 2n^{-3\varepsilon}$
for all $k \ne i,j$, and thus the number of $k$ such that the triangle $ijk$ is an atom
stochastically dominates the binomial distribution $\mathrm{Bi}(n-2, 2n^{-3\varepsilon})$; hence this
number is whp at least $n^{1-3\varepsilon}$.

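We have shown that whp there are at least two vertices $i$ and $j$ with degrees
$\ge n^{1-3\varepsilon}$, and thus, for large $n$, $\mathbb P(D_n \ge n^{1-3\varepsilon}) \ge (1-o(1))\frac{2}{n} > \frac{1}{n}$. Consequently,
for large $n$,
\[
  \mathbb E D_n^2 \ge \frac{1}{n}\,n^{2(1-3\varepsilon)} = n^{1-6\varepsilon} \to \infty.
\]
On the other hand, for some finite $c = \int_{S^3} \kappa_3$, by symmetry, $\lambda_{K_3,j}(x) = c$
and $\lambda_x = 3c\,\delta_2$. Hence $\mathrm{MCPo}(\Lambda) = \mathrm{CPo}(3c\,\delta_2)$, which is the distribution of $2X$
with $X \sim \mathrm{Po}(3c)$, which has all moments finite.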
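The spacing formula $\mathbb P(\Delta > s) = (1 - sn)^{n-1}$ used in Example 6.4 can be checked by a quick Monte Carlo experiment. The Python sketch below (function names are assumptions of the example) samples $n$ uniform points on the circle $[0,1)$, computes the minimal circular spacing from the sorted points, and compares the empirical tail with the exact expression and with the upper bound $e^{-sn(n-1)}$.

```python
import math
import random


def min_circular_spacing(points):
    """Smallest circle distance between any two of the given points of [0, 1)."""
    pts = sorted(points)
    gaps = [b - a for a, b in zip(pts, pts[1:])]
    gaps.append(1.0 - pts[-1] + pts[0])  # wrap-around gap
    return min(gaps)


def spacing_tail_exact(n, s):
    """P(Delta > s) = (1 - s n)^(n-1), valid for 0 <= s <= 1/n."""
    return (1.0 - n * s) ** (n - 1)


def spacing_tail_bound(n, s):
    """The upper bound exp(-s n (n-1)) used in the example."""
    return math.exp(-s * n * (n - 1))


def spacing_tail_mc(n, s, trials, rng):
    """Monte Carlo estimate of P(Delta > s) from `trials` independent samples."""
    hits = 0
    for _ in range(trials):
        if min_circular_spacing([rng.random() for _ in range(n)]) > s:
            hits += 1
    return hits / trials
```

For instance, with `n = 20` and `s = 0.001` the exact tail is $0.98^{19}$, and a run of `spacing_tail_mc` with a few thousand trials should land close to it.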

As we shall see in Theorem 7.4, this situation cannot arise in the edge-only
version of the model, i.e., the model in [10]; in the terminology of the next
section, all copies of $P_2$ are then 'regular'.

In Section [8] we shall illustrate Theorem 6.3 by giving a natural family of
examples with degree distributions with power-law tails.

7 Small subgraphs 

In this section we turn to the final general property of $G(n,\kappa)$ we shall study,
the asymptotic number of copies of a fixed graph $F$ in $G(n,\kappa)$; throughout this
section, $\kappa$ denotes a kernel family $(\kappa_F)_{F\in\mathcal F}$ rather than a hyperkernel. We work
with the multi-graph version of the model.

Although mathematically not as interesting as the phase transition, the number
of small graphs in $G(n,\kappa)$ is important as it is directly related to the original
motivation for the model. Indeed, recall that perhaps the main defect of the
model of [10], i.e., the edge-only case of the present model, is that it produces
graphs with very few (usually $O_p(1)$) triangles, i.e., graphs with clustering
coefficients that are essentially zero. This contrasts strongly with many of the
real-world networks we wish to model.

The simplest way that a copy of some graph $F$ may arise in $G(n,\kappa)$ is as an
atom. The expected number of such copies is simply
\[
  \frac{n_{(|F|)}}{n^{|F|-1}} \int_{S^{|F|}} \kappa_F \le n \int_{S^{|F|}} \kappa_F. \tag{50}
\]

The next simplest way that a copy of $F$ may arise is as a subgraph of some
atom $F'$ of $G(n,\kappa)$. Let us call such copies of $F$ direct; we include the case
$F' = F$. Let $n(F,F')$ denote the number of subgraphs of $F'$ isomorphic to $F$,
so $n(K_3,K_4) = 4$, for example. Set
\[
  \tilde t_1(F,\kappa) := \sum_{F'\in\mathcal F} n(F,F') \int_{S^{|F'|}} \kappa_{F'},
\]
and let $n_d(F,G(n,\kappa))$ denote the number of direct copies of $F$ in $G(n,\kappa)$. Then
from (50) we see that
\[
  \mathbb E\,n_d(F,G(n,\kappa)) \le \tilde t_1(F,\kappa)\,n,
\]
and that if $\kappa$ is bounded, then
\[
  \mathbb E\,n_d(F,G(n,\kappa)) = \tilde t_1(F,\kappa)\,n + O(1).
\]

The reason for the somewhat peculiar notation $\tilde t_1$ is as follows: the subscript
1 indicates direct copies (arising from only one atom). The tilde will be useful
later to differentiate from standard notation $t(F,\kappa)$ in other contexts.
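The combinatorial factor $n(F,F')$ entering the count of direct copies is easy to compute for small graphs. The Python sketch below counts injective homomorphisms by brute force over injective vertex maps and divides by the number of automorphisms, using the relation $\mathrm{emb}(F,F') = n(F,F')\,\mathrm{aut}(F)$ that appears later in this section. The function names and the edge-list encoding of simple graphs on vertex set $\{0,\dots,r-1\}$ are assumptions of the example.

```python
from itertools import permutations


def embeddings(r, edges_f, r2, edges_g):
    """Number of injective homomorphisms from F (vertices 0..r-1) into
    G (vertices 0..r2-1); both graphs are simple and given as edge lists."""
    g = {frozenset(e) for e in edges_g}
    return sum(
        all(frozenset((phi[u], phi[v])) in g for u, v in edges_f)
        for phi in permutations(range(r2), r)  # all injective vertex maps
    )


def aut(r, edges):
    """Number of automorphisms: for a finite simple graph, the injective
    homomorphisms from F to itself are exactly its automorphisms."""
    return embeddings(r, edges, r, edges)


def copies(r, edges_f, r2, edges_g):
    """n(F, G): number of subgraphs of G isomorphic to F."""
    return embeddings(r, edges_f, r2, edges_g) // aut(r, edges_f)
```

With `K3 = [(0, 1), (0, 2), (1, 2)]` and `K4` the six pairs on four vertices, `copies(3, K3, 4, K4)` recovers the value $n(K_3,K_4) = 4$ quoted above.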

It will turn out that in well behaved cases (for example for all bounded
kernel families), essentially all copies of any 2-connected graph $F$ in $G(n,\kappa)$
arise directly. Unfortunately, this is not the case for general $F$. Perhaps the
main special cases we are interested in are stars; the number of copies of the
star $K_{1,2}$ (i.e., the path $P_2$) is needed to calculate the clustering coefficient,
for example. Note that the number of copies of the star $K_{1,k}$ ($k \ge 2$) in any
graph $G$ is simply $|G|/k!$ times the $k$th factorial moment of the degree of a
random vertex; hence counting stars is closely related to studying the degree
distribution, which we did for $G(n,\kappa)$ in Section 6.
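The relation between star counts and factorial moments of the degree can be verified directly on small simple graphs. In the Python sketch below (names are assumptions of the example), the number of copies of $K_{1,k}$ is computed as $\sum_v \binom{d_v}{k}$, by choosing a centre and $k$ of its neighbours, and compared with $|G|/k!$ times the $k$th factorial moment of the degree of a uniformly random vertex.

```python
from math import comb, factorial


def degrees(n, edges):
    """Degree sequence of a simple graph on vertices 0..n-1."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def star_copies(n, edges, k):
    """Number of copies of the star K_{1,k}, k >= 2: a centre plus k neighbours."""
    return sum(comb(d, k) for d in degrees(n, edges))


def factorial_moment(n, edges, k):
    """E[D (D-1) ... (D-k+1)] for the degree D of a uniformly random vertex."""
    total = 0
    for d in degrees(n, edges):
        prod = 1
        for i in range(k):
            prod *= d - i
        total += prod
    return total / n


def star_copies_from_moment(n, edges, k):
    """|G| / k! times the k-th factorial moment of the degree."""
    return n * factorial_moment(n, edges, k) / factorial(k)
```

The restriction $k \ge 2$ matters: for $k = 1$ the sum $\sum_v d_v$ counts every edge twice, once from each endpoint.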

Let us say that a copy of $F$ in $G(n,\kappa)$ arises indirectly if it contains edges
of at least two of the atoms making up $G(n,\kappa)$. To understand the expected
number of such copies we first need to understand the probability that a certain
set of vertices form a copy of $F$ given the types of the vertices. More precisely,
we consider the expectation of the number of copies of $F$ with a given vertex
set, even though this number is highly unlikely to exceed 1.

Let $F$ be a connected graph with $r$ vertices. Let $\mathrm{emb}(F,F')$ denote the
number of embeddings of $F$ into $F'$, i.e., the number of injective homomorphisms
from $F$ to $F'$, so $\mathrm{emb}(F,F') = n(F,F')\,\mathrm{aut}(F)$. Fixing a labelling of $F$, let
$X^0_F(G)$ denote the number of copies of $F$ in a multigraph $G$ with vertex $i$ of
$F$ corresponding to vertex $i$ of $G$. (Thus $X^0_F(G)$ is 0 or 1 if $G$ is simple.) The
contribution to $\mathbb E(X^0_F(G(n,\kappa)) \mid x_1,\dots,x_r)$ from copies of $F$ arising as subgraphs
of atoms isomorphic to a given $F'$ with $r'$ vertices is exactly
\[
  \frac{(n-r)_{(r'-r)}}{n^{r'-1}} \sum_{\varphi:F\to F'} \int_{S^{r'-r}} \kappa_{F'}(y_1,\dots,y_{r'}),
\]
where $\varphi$ runs over all $\mathrm{emb}(F,F')$ embeddings of $F$ into $F'$, we take $y_j = x_i$ if
$\varphi(i) = j$, and we integrate over the remaining $r' - r$ variables $y_j$.
Set
\[
  \sigma_F(x_1,\dots,x_r) = \sigma_F(x_1,\dots,x_r;\kappa) := \sum_{F'} \sum_{\varphi:F\to F'} \int_{S^{|F'|-r}} \kappa_{F'}(y_1,\dots,y_{|F'|}). \tag{51}
\]
Then we have
\[
  \mathbb E(X^0_F(G(n,\kappa)) \mid x_1,\dots,x_r) \le n^{-(r-1)}\,\sigma_F(x_1,\dots,x_r;\kappa), \tag{52}
\]
and if $\kappa$ is bounded then the relative error is $O(n^{-1})$.

Comparing (51) and (4), note that if $F = K_2$, then $\sigma_F = \kappa_e$. Before continuing,
let us comment on the normalization. Recall that in defining $G(n,\kappa)$,
we consider all $r!$ possible ways of adding a (labelled) copy of $F$ on vertex set
$\{1,2,\dots,r\}$, say, adding each copy with probability $\kappa_F(x_1,\dots,x_r)/n^{r-1}$. This
means that the contribution from $\kappa_F$ to $\mathbb E X^0_F(G(n,\kappa))$ is $\mathrm{aut}(F)\kappa_F(x_1,\dots,x_r)/n^{r-1}$
and, correspondingly, the contribution from $\kappa_F$ to $\sigma_F$ is $\mathrm{aut}(F)\kappa_F(x_1,\dots,x_r)$.
In other words, while having the same form as a kernel, $\sigma_F$ is normalized differently.
This situation arises already in the edge-only case, where $\kappa_e(x,y) =
2\kappa_2(x,y)$. It turns out that here the normalization of $\sigma_F$, giving directly the
probability that a certain set of edges forming a copy of $F$ is present, is the
natural one. Note that if we had used this normalization from the beginning,
then formulae such as (50) would have extra factors.

Let $F$ be a connected graph with vertex set $[r]$. We say that a set $F_1,\dots,F_a$
of connected graphs forms a tree decomposition of $F$ if each $F_i$ is connected, the
union of the $F_i$ is exactly $F$, any two of the $F_i$ share at most one vertex, and
the $F_i$ intersect in a tree-like structure. The last condition may be expressed by
saying that the $F_i$ may be ordered so that each $F_i$ other than the first meets the
union of the previous ones in exactly one vertex. Equivalently, the intersection
is tree-like if $|F| = 1 + \sum_i (|F_i| - 1)$. Equivalently, defining (as usual) a block
of a graph $G$ to be either a maximal 2-connected subgraph of $G$ or a bridge in
$G$, $F_1,\dots,F_a$ forms a tree decomposition of $F$ if each $F_i$ is a connected union of
one or more blocks of $F$, with each block contained in exactly one $F_i$. (Cf. [H,
p. 74].)

Note that we allow a = 1, in which case Fi = F. For a > 2, the order 
of the factors is irrelevant, so, for example, Ki^2 has a unique non-trivial tree 
decomposition, into two edges. Note also that if F is 2-connected, then it has 
only the trivial tree decomposition. 
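
As an illustration only (it is not part of the argument), the counting form of the tree-like condition can be checked mechanically. The Python sketch below assumes that each piece $F_i$ is given simply by its vertex set, and that connectivity of the pieces and of their union has been verified separately; the function name `is_tree_like` is hypothetical.

```python
from itertools import combinations

def is_tree_like(parts):
    """Check the vertex-counting conditions for a candidate tree decomposition.

    parts: list of vertex sets, one per piece F_i (assumed connected).
    """
    parts = [set(p) for p in parts]
    # Any two pieces may share at most one vertex.
    if any(len(p & q) > 1 for p, q in combinations(parts, 2)):
        return False
    # Counting condition |F| = 1 + sum_i (|F_i| - 1).
    union = set().union(*parts)
    return len(union) == 1 + sum(len(p) - 1 for p in parts)

# K_{1,2} with centre 0, split into its two edges: tree-like.
print(is_tree_like([{0, 1}, {0, 2}]))
# A triangle split into its three edges: not tree-like (3 != 1 + 3).
print(is_tree_like([{0, 1}, {1, 2}, {0, 2}]))
```

The second call reflects the remark that a 2-connected graph has only the trivial tree decomposition.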

Let us say that a copy of $F$ in $G(n,\kappa)$ is regular if it is the union of graphs
$F_1,\dots,F_a$ forming a tree decomposition of $F$, where each $F_i$ arises directly
as a subgraph of some atom $F_i'$, and $V(F_i') \cap V(F_j') = V(F_i) \cap V(F_j)$ for all
$i \ne j$ (with this intersection containing at most one vertex). We can write down
exactly the probability that $G(n,\kappa)$ contains a regular copy of $F$ with vertex
set $1,\dots,r$ in terms of certain integrals of products of conditional expectations
$\mathbb{E}(X_{F_i}(G) \mid x_1,\dots,x_s)$. We shall not do so. Instead, let

$$t(F,\kappa) = \sum \int_{S^r} \sigma_{F_1}\sigma_{F_2}\cdots\sigma_{F_a}\,d\mu(x_1)\cdots d\mu(x_r), \tag{53}$$

where the sum runs over all tree decompositions of $F$ and each term $\sigma_{F_i}$ is
evaluated at the subset of $x_1,\dots,x_r$ corresponding to the vertices of $F_i \subset F$,
and set

$$\tilde t(F,\kappa) = \mathrm{aut}(F)^{-1}\,t(F,\kappa). \tag{54}$$

Note that these definitions extend to disconnected graphs $F$, taking the sum
over all combinations of one tree decomposition for each component of $F$.

The upper bound (52) easily implies that the expected number of regular
copies of $F$ in $G(n,\kappa)$ is at most $\tilde t(F,\kappa)n$, and furthermore this bound is
correct within a factor $1 + O(n^{-1})$ if $\kappa$ is bounded; the factor $\mathrm{aut}(F)^{-1}$
appears because there are $n_{(r)}/\mathrm{aut}(F)$ potential copies of $F$. Note that the number
$\mathrm{emb}(F,G)$ of embeddings of a graph $F$ into a graph $G$, i.e., the number of injective
homomorphisms from $F$ to $G$, is simply $\mathrm{aut}(F)\,n(F,G)$. Hence $t(F,\kappa)$ is the
appropriate normalization for counting embeddings of $F$ into $G(n,\kappa)$ rather than
copies of $F$. In other contexts, when dealing with dense graphs, it turns out to be most
natural to consider homomorphisms from $F$ to $G$, the number of which will be
very close to $\mathrm{emb}(F,G)$. Thus the normalization in (53) is standard in related
contexts. (See, for example, Lovász and Szegedy [34].)
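
The relation $\mathrm{emb}(F,G) = \mathrm{aut}(F)\,n(F,G)$ is easy to confirm by brute force on small examples. The following Python sketch is an illustration under stated assumptions: graphs are edge lists on vertex sets $\{0,\dots,r-1\}$ and $\{0,\dots,n-1\}$, $F$ has no isolated vertices (so a copy is determined by its edge set), and the helper names `embeddings` and `copies` are hypothetical.

```python
from itertools import permutations

def embeddings(F_edges, r, G_edges, n):
    """All injective maps [r] -> [n] sending every edge of F to an edge of G."""
    G = {frozenset(e) for e in G_edges}
    return [p for p in permutations(range(n), r)
            if all(frozenset((p[u], p[v])) in G for u, v in F_edges)]

def copies(F_edges, r, G_edges, n):
    """Distinct copies of F in G, identified by their edge sets."""
    return {frozenset(frozenset((p[u], p[v])) for u, v in F_edges)
            for p in embeddings(F_edges, r, G_edges, n)}

P2 = [(0, 1), (1, 2)]            # path with 2 edges
K3 = [(0, 1), (1, 2), (0, 2)]    # triangle

emb = len(embeddings(P2, 3, K3, 3))   # embeddings of P2 into K3
aut = len(embeddings(P2, 3, P2, 3))   # automorphisms of P2
n_copies = len(copies(P2, 3, K3, 3))  # copies of P2 in K3
print(emb, aut, n_copies)
```

For $F = P_2$ and $G = K_3$ this gives 6 embeddings, 2 automorphisms and 3 copies.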

Let us illustrate the definitions above with two simple examples. 

Example 7.1. The simplest case is $F = K_2$. In this case, there is only the
trivial tree decomposition, and (53) and (54) yield

$$\tilde t(K_2,\kappa) = \frac12\int_{S^2}\sigma_{K_2}(x,y) = \frac12\int_{S^2}\kappa_e(x,y) = \xi(\kappa). \tag{55}$$

Example 7.2. Suppose that $\kappa$ contains only two non-zero kernels, $\kappa_2$,
corresponding to an edge, and $\kappa_3$, corresponding to a triangle; our aim is to calculate
$\tilde t(P_2,\kappa)$ in this case, where $P_2$ is the path with 2 edges. Using symmetry of $\kappa_2$
and $\kappa_3$,

$$\sigma_{K_2}(x,y) = 2\kappa_2(x,y) + 6\int_S \kappa_3(x,y,z)\,d\mu(z), \tag{56}$$

while

$$\sigma_{P_2}(x,y,z) = 6\kappa_3(x,y,z), \tag{57}$$

reflecting the fact that the $P_2$ $ijk$ appears directly in $G(n,\kappa)$ if and only if we added
a triangle with vertex set $\{i,j,k\}$, and this vertex set corresponds to 6 3-tuples.
Since $\mathrm{aut}(P_2) = 2$, it follows that

$$\tilde t(P_2,\kappa) = \frac12\int_{S^3}\bigl(\sigma_{P_2}(x,y,z) + \sigma_{K_2}(x,y)\sigma_{K_2}(y,z)\bigr)\,d\mu(x)\,d\mu(y)\,d\mu(z).$$

More generally, let $F$ be any (simple) subgraph of $G_n = G(n,\kappa)$ with $k$
components. (We abuse notation by now writing $F$ for a specific subgraph of
$G_n$, rather than an isomorphism class of graphs.) Let $F_1',\dots,F_a'$ list all atoms
contributing edges of $F$, and let $F_i = F_i' \cap F$, where we take the intersection
in the multigraph sense, i.e., intersect the edge sets. For example, if $e_1$ and
$e_2$ are parallel edges in $G_n$ forming a double edge from $i$ to $j$, and $e_1 \in E(F)$,
$e_2 \in E(F_1')$, then $F_1 = F_1' \cap F$ contains no $ij$ edge, even though $F_1'$ and $F$ each do
so. By definition each $F_i$ contains at least one edge, and $F$ is the edge-disjoint
union of the $F_i$. Since $F$ has $k$ components, when adding the $F_i$ one by one,
at least $a - k$ times a new component is not created, so at least $a - k$ times at
least one vertex of $F_i$, and hence of $F_i'$, is repeated. It follows that

$$\sum_i (|F_i'| - 1) \ge \Bigl|\bigcup_i F_i'\Bigr| - k. \tag{58}$$

Extending our earlier definition, we call $F$ regular if equality holds in (58), and
exceptional otherwise. Note that if any $F_i$ is disconnected, then $F$ is exceptional.
Let $n_{\mathrm r}(F,G_n)$ denote the number of regular copies of $F$ in $G_n = G(n,\kappa)$,
and $n_{\mathrm x}(F,G_n)$ the number of exceptional copies.

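Theorem 7.3. Let $G_n = G(n,\kappa)$, where $\kappa$ is a kernel family, and let $F$ be a
graph with $k$ components. Then

$$\mathbb{E}\,n_{\mathrm r}(F,G_n) \le n^k\,\tilde t(F,\kappa).$$

If $\kappa$ is bounded, then

$$\mathbb{E}\,n_{\mathrm x}(F,G_n) = O(n^{k-1}),$$
$$\mathrm{Var}(n(F,G_n)) = O(n^{2k-1}),$$

and

$$n(F,G_n) = n_{\mathrm r}(F,G_n) + n_{\mathrm x}(F,G_n) = n^k\,\tilde t(F,\kappa)\bigl(1 + O_p(n^{-1/2})\bigr).$$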

Proof. We have essentially given the proof of the first statement, so let us just
outline it. To construct a regular copy of $F$ in $G_n$ we must first choose graphs
$F_1,\dots,F_a$ on $V(F)$ forming a tree decomposition of each component of $F$. Then
we must choose a graph $F_i'$ containing each $F_i$ to be the atom that will contain
$F_i$. Then we must choose $s = |\bigcup_i F_i'|$ distinct vertices $v_1,\dots,v_s$ from $1,\dots,n$ to
be the vertices of the $F_i'$, where (since $F$ is regular), we have $s = k + \sum_i(|F_i'| - 1)$.

Note that there are $n_{(s)} \le n^s$ choices for the vertices $v_i$. (We are glossing
over the details of the counting, and in particular various factors $\mathrm{aut}(H)$ for
various graphs $H$. It should be clear comparing the definition of $\tilde t(F,\kappa)$ with
what follows that these are in the end accounted for correctly.)

Given the vertex types, the probability that these particular graphs $F_i'$
arise is then (up to certain factors $\mathrm{aut}(F_i')$) a product of factors of the form
$\kappa_{F_i'}/n^{|F_i'|-1}$, where the kernel is evaluated at an appropriate subset of
$x_{v_1},\dots,x_{v_s}$. Note that the overall power of $n$ in the denominator is
$\sum_i(|F_i'| - 1) = s - k$. Integrating out over the variables $x_j$ corresponding to
$V(F_i') \setminus V(F_i)$, and summing over all $F_i' \supset F_i$, the factor $\kappa_{F_i'}$ becomes a
factor $\sigma_{F_i}$. Finally, integrating out over the remaining variables, corresponding
to vertices of $F$, and summing over decompositions, we obtain $n^k\,\tilde t(F,\kappa)$ as an
upper bound.

If $\kappa$ is bounded then the number $s$ of vertices appearing above is bounded,
so $n_{(s)}/n^s = 1 - O(n^{-1})$, where the error term is uniform over all choices for
$F_1',\dots,F_a'$. It follows that in this case,

$$\mathbb{E}\,n_{\mathrm r}(F,G_n) = \tilde t(F,\kappa)\,n^k\bigl(1 - O(n^{-1})\bigr).$$

Arguing similarly for exceptional copies, the power $\sum_i(|F_i'| - 1)$ of $n$ in the
denominator is now at least $s - k + 1$, and it follows that if $\kappa$ is bounded, then
$\mathbb{E}\,n_{\mathrm x}(F,G_n) = O(n^{k-1})$ as claimed. It follows that

$$\mathbb{E}\,n(F,G_n) = \tilde t(F,\kappa)\,n^k + O(n^{k-1}). \tag{59}$$

Finally, for the variance we simply note that $\mathbb{E}\,n(F,G_n)^2$ is the expected
number of ordered pairs $(F_1,F_2)$ of not necessarily disjoint copies of $F$ in $G_n$.
If $F_1$ and $F_2$ share one or more vertices, then $F_1 \cup F_2$ has at most $2k - 1$
components. From (59), the expected number of such pairs is $O(n^{2k-1})$. The
expected number of pairs with $F_1$ and $F_2$ disjoint is simply $N\,\mathbb{E}\,n(2F,G_n)$, where
$2F$ is the disjoint union of two copies of $F$ and $N = \mathrm{aut}(2F)/\mathrm{aut}(F)^2$ is a
symmetry factor, the number of ways $2F$ can be divided into 2 copies of $F$. (If
$F$ is connected then simply $N = 2$ and in general, if $F$ has distinct components
$F_j$ with multiplicities $m_j$, then $N = \prod_j \binom{2m_j}{m_j}$.) Since $t(2F,\kappa) = t(F,\kappa)^2$, we
have $\tilde t(2F,\kappa) = \tilde t(F,\kappa)^2/N$, so (59) gives

$$\mathbb{E}\,n(F,G_n)^2 = N\,\mathbb{E}\,n(2F,G_n) + O(n^{2k-1}) = \tilde t(F,\kappa)^2 n^{2k} + O(n^{2k-1}),$$

from which the variance bound follows. The final bound follows by Chebyshev's
inequality. □



For bounded kernel families, Theorem 7.3 is more or less the end of the story,
although one can of course prove more precise results. For unbounded kernel
families the situation is much more complicated. Let us first note that regular
copies of $F$ do not give rise to any problems.

Theorem 7.4. Let $\kappa$ be a kernel family and $F$ a connected graph, and let $G_n =
G(n,\kappa)$. Then $n_{\mathrm r}(F,G_n)/n \overset{\mathrm p}{\to} \tilde t(F,\kappa) \le \infty$. In other words, if
$\tilde t(F,\kappa) = \infty$, then for any constant $C$, whp $n_{\mathrm r}(F,G_n) \ge Cn$, while if
$\tilde t(F,\kappa) < \infty$, then $n_{\mathrm r}(F,G_n) = \tilde t(F,\kappa)n + o_p(n)$.

Proof. We consider the truncated kernel families $\kappa^M$. Since $t(F,\kappa)$ is a sum
of integrals of products of sums of integrals of the kernels $\kappa_{F_i}$, by monotone
convergence we have $t(F,\kappa^M) \to t(F,\kappa) \le \infty$ as $M \to \infty$, and hence
$\tilde t(F,\kappa^M) \to \tilde t(F,\kappa)$.

If $\tilde t(F,\kappa) = \infty$, choose $M$ so that $\tilde t(F,\kappa^M) > C$, and couple $G_n' = G(n,\kappa^M)$
and $G_n$ in the natural way so that $G_n' \subseteq G_n$. Since $\kappa^M$ is bounded, Theorem 7.3
implies that $n_{\mathrm r}(F,G_n') \ge Cn$ whp. Since $n_{\mathrm r}(F,G_n) \ge n_{\mathrm r}(F,G_n')$, the result
follows.

If $\tilde t(F,\kappa) < \infty$, then given $\varepsilon > 0$, the truncation argument above shows that
$n_{\mathrm r}(F,G_n) \ge (\tilde t(F,\kappa) - \varepsilon)n$ holds whp. By the first statement of Theorem 7.3,
$\mathbb{E}\,n_{\mathrm r}(F,G_n) \le \tilde t(F,\kappa)n$. Combining these two bounds gives the result. □

Note that we do not directly control the variance of $n_{\mathrm r}(F,G_n)$; as we shall
see in Section 8, there are natural examples where $n_{\mathrm r}(F,G_n)/n$ is concentrated
about its finite mean even though its variance tends to infinity.

The very simplest case of Theorem 7.4 concerns edges; we stated this as a
separate result in the introduction.

Proof of Theorem 1.3. Since all copies of $K_2$ in $G_n = G(n,\kappa)$ are regular (and
direct), $e(G_n) = n(K_2,G_n) = n_{\mathrm r}(K_2,G_n)$, and taking $F = K_2$ in Theorem 7.4
and using (55) yields $e(G_n)/n \overset{\mathrm p}{\to} \xi(\kappa)$, which is the first claim of
Theorem 1.3. It remains to show that $\mathbb{E}\,e(G_n)/n = \mathbb{E}\,n_{\mathrm r}(K_2,G_n)/n \to \xi(\kappa)$. The
lower bound follows from the first part, since convergence in probability implies
$\liminf_{n\to\infty} \mathbb{E}\,e(G_n)/n \ge \xi(\kappa)$, while Theorem 7.3 gives
$\mathbb{E}\,e(G_n)/n \le \tilde t(K_2,\kappa) = \xi(\kappa)$, completing the proof. □

It is also easy to prove Theorem 1.3 directly, using truncations as in this
section but avoiding many complications present in the general case.

By a moment of a kernel family $\kappa$ we shall mean any integral of the form

$$\int_{S^d} \kappa_{F_1}\kappa_{F_2}\cdots\kappa_{F_r}\,d\mu(x_1)\cdots d\mu(x_d),$$

where $F_1,\dots,F_r$ are not necessarily distinct, and each term $\kappa_{F_i}$ is evaluated at
some $|F_i|$-tuple of distinct $x_j$. The proof of Theorem 7.3 shows that for any
connected $F$, $\mathbb{E}\,n_{\mathrm x}(F,G(n,\kappa))$ is bounded by a sum of moments of $\kappa$. This gives
a very strong condition under which we can control $n_{\mathrm x}(F,G(n,\kappa))$.

Theorem 7.5. Let $\kappa$ be a kernel family in which only finitely many kernels
$\kappa_F$ are non-zero. Suppose also that all moments of $\kappa$ are finite. Then for any
connected $F$, $\mathbb{E}\,n_{\mathrm x}(F,G(n,\kappa)) = O(1)$, and the conclusions of Theorem 7.4 apply
with $n_{\mathrm r}(F,G_n)$ replaced by $n(F,G_n)$.

Proof. This is essentially trivial from the comments above and Theorem 7.4.
We omit the details. □

Example 6.4 shows that some conditions are necessary to control $n_{\mathrm x}(F,G(n,\kappa))$;
we refer the reader to (49) for the description of the kernel family in this case.
Plugging ([ig|) into ((5^ . in this case we have aK^i^iV) = ^d{x,yY^^ + a for 
some constant a (in fact, a = 24£~^2^^), and it easily follows that i(P2, s) < 00. 
However, as shown in the discussion of that example, whp there is a vertex with
degree at least $n^{1-3\varepsilon}$, and hence at least $n^{2-6\varepsilon}$ copies of $P_2$, which is much larger
than $n$ if $\varepsilon < 1/6$. In this case the problem is exceptional $P_2$s $ijk$ arising from
atoms $ij\ell$ and $jk\ell$: the corresponding moment

is infinite, due to the contribution from d(x2,X4)'^^~'^. 

Of course, not all moments contribute to $\mathbb{E}\,n_{\mathrm x}(F,G_n)$; as we shall see in the
next section, it is easy to obtain results similar to Theorem 7.5 under weaker
assumptions in special cases. Also, in general it may happen that $n_{\mathrm x}(F,G_n)$ has
infinite expectation (in the multigraph form), but is nonetheless often small, i.e.,
that the large expectation comes from the small probability of having a vertex
in very many copies of $F$. Much more generally, it turns out that when $\kappa$ is
integrable, whp all exceptional copies of $F$ sit on a rather small set of vertices.

Theorem 7.6. Let $\kappa$ be an integrable kernel family and $F$ a connected graph,
with $\tilde t(F,\kappa)$ finite. Let $G_n = G(n,\kappa)$.

For any $\varepsilon > 0$, there is a $\delta > 0$ such that whp every graph $G_n'$ formed from
$G_n$ by deleting at most $\delta n$ vertices has $n(F,G_n') \ge (\tilde t(F,\kappa) - \varepsilon)n$.

For any $\varepsilon > 0$ and any $\delta > 0$, whp there is some graph $G_n'$ formed from $G_n$
by deleting at most $\delta n$ vertices such that $n(F,G_n') \le (\tilde t(F,\kappa) + \varepsilon)n$.

Together the statements above may be taken as saying that $G_n$ contains
essentially $(\tilde t(F,\kappa) + o_p(1))n$ copies of $F$, where 'essentially' means that we may
ignore $o(n)$ vertices. In other words, the 'bulk' of $G_n$ contains this many copies
of $F$, though a few exceptional vertices may meet many more copies.

Proof. We start with the second statement, since it is more or less immediate.
Indeed, writing $\int\kappa$ for $\sum_{F\in\mathcal{F}} |F| \int_{S^{|F|}} \kappa_F$, and considering truncations $\kappa^M$ as
usual, from monotone convergence we have $\int\kappa^M \nearrow \int\kappa$ as $M \to \infty$. Let $\varepsilon > 0$,
$\delta > 0$ and $\eta > 0$ be given. Since $\kappa$ is integrable, i.e., $\int\kappa < \infty$, there is some $M$
such that $\int\kappa^M > \int\kappa - \delta\eta/2$. Coupling $G_n^M = G(n,\kappa^M)$ and $G_n = G(n,\kappa)$ in
the usual way, let us call a vertex bad if it meets an atom present in $G_n$ but not
$G_n^M$. The expected number of bad vertices is at most the expected sum of the
sizes of the extra atoms, which is at most $n(\int\kappa - \int\kappa^M) \le \delta\eta n/2$. Hence the
probability that there are more than $\delta n$ bad vertices is at most $\eta/2$.

Deleting all bad vertices from $G_n$ leaves a graph $G_n'$ with at most $n(F,G_n^M)$
copies of $F$. Applying Theorem 7.3, this number is at most $(\tilde t(F,\kappa^M) + \varepsilon)n \le
(\tilde t(F,\kappa) + \varepsilon)n$ whp, so we see that if $n$ is large enough, then with probability
at least $1 - \eta$ we may delete at most $\delta n$ vertices to leave $G_n'$ with at most
$(\tilde t(F,\kappa) + \varepsilon)n$ copies of $F$, as required.

Turning to the first statement, we may assume without loss of generality that
$\kappa$ is bounded. Indeed, there is some truncation $\kappa^M$ with $\tilde t(F,\kappa^M) \ge \tilde t(F,\kappa) -
\varepsilon/2$, and taking $G(n,\kappa^M) \subseteq G(n,\kappa)$ as usual, it suffices to prove the same
statement for $G(n,\kappa^M)$ with $\varepsilon$ replaced by $\varepsilon/2$. Assuming $\kappa$ is bounded, then
by Theorem 7.3 we have $n(F,G_n) \ge (\tilde t(F,\kappa) - \varepsilon/2)n$ whp, so it suffices to prove
that if $\kappa$ is bounded and $\varepsilon > 0$, then there is some $\delta > 0$ such that whp any $\delta n$
vertices of $G_n = G(n,\kappa)$ meet at most $\varepsilon n$ copies of $F$.

Let $v$ be a fixed vertex of $F$, and for $1 \le i \le n$ let $a_i$ denote the number
of homomorphisms from $F$ to $G_n$ mapping $v$ to vertex $i$. Let $F'$ be the graph
formed from two copies of $F$ meeting only at $v$. Then there are exactly $a_i^2$
homomorphisms from $F'$ to $G_n$ mapping $v$ to $i$, so in total there are $\sum_i a_i^2$
homomorphisms from $F'$ to $G_n$. Now the image of any homomorphism from $F'$
to $G_n$ is a connected subgraph $F''$ of $G_n$, and each such subgraph is the image
of $O(1)$ homomorphisms. Applying Theorem 7.3 to each of the $O(1)$ possible
isomorphism types of $F''$, it follows that there is some constant $C$ such that,
whp,

$$\sum_i a_i^2 = \hom(F',G_n) \le Cn.$$

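When the upper bound holds, given any set $S \subset [n]$ with $|S| \le \delta n$, by the
Cauchy-Schwarz inequality we have

$$\sum_{i\in S} a_i \le \sqrt{|S|}\,\sqrt{\sum_{i\in S} a_i^2} \le \sqrt{\delta n}\,\sqrt{Cn} = \sqrt{C\delta}\,n.$$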



Repeating the argument above for each vertex $v$ of $F$ and summing, we
see that there is some $C' < \infty$ (given by the sum of at most $|F|$ constants
corresponding to $\sqrt{C}$ above) such that whp for any $\delta > 0$, and any set $S$ of at
most $\delta n$ vertices of $G_n$, there are at most $C'\sqrt{\delta}\,n$ homomorphisms from $F$ to $G_n$
mapping any vertex of $F$ into $S$. This condition implies that $S$ meets at most
$C'\sqrt{\delta}\,n$ copies of $F$, so choosing $\delta$ such that $C'\sqrt{\delta} < \varepsilon$, we see that whp any $\delta n$
vertices meet at most $\varepsilon n$ copies of $F$. As noted above, the first statement of the
theorem follows. □



8 A power-law graph with clustering 

Our aim in this paper has been to introduce a very general family of sparse 
random graph models, showing that despite the generality, the models are still 
susceptible to mathematical analysis. The question of which special cases of the 
model may be relevant in applications is a very broad one, and not our focus. 
Nevertheless, in the light of the motivation of the model, we shall investigate 
one special case. We should like to show that, with an appropriate choice of 
kernel family, our model gives rise to graphs with power-law degree distributions, 
with various ranges of the degree exponent, the clustering coefficient (see (67)),
and the mixing coefficient (see (70)). We achieve this in the simplest possible
way, considering a 'rank 1' version of the model in which we add only edges
and triangles. We do not claim that this particular model is appropriate for
any particular real-world example; nevertheless, it shows the potential of our
model to produce graphs that are similar to real-world graphs, where similarity
is measured by the values of these important and much studied parameters.

Throughout this section we fix three parameters, $\alpha > 1$, and $A, B \ge 0$
with $A + B > 0$. We consider one specific kernel family $\kappa$ on $S = (0,1]$ with
$\mu$ Lebesgue measure. Our kernel family has only two non-zero kernels, $\kappa_2$,
corresponding to edges, and $\kappa_3$ to triangles, with

$$\kappa_2(x,y) = A\,x^{-1/\alpha}y^{-1/\alpha}$$

and

$$\kappa_3(x,y,z) = B\,x^{-1/\alpha}y^{-1/\alpha}z^{-1/\alpha}.$$

We could of course consider many other possible functions, but these seem the
simplest and most natural for our purposes. It would be straightforward to
carry out computations such as those that follow with each of the $\alpha$s above
replaced by a different constant, for example, although we should symmetrize
the kernels in this case. However, one of these exponents would determine the
power law, and it seems most natural to take them all equal.
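For convenience, we define

$$\beta_k = \int_0^1 x^{-k/\alpha}\,dx = \begin{cases} \alpha/(\alpha-k), & \alpha > k, \\ \infty, & \alpha \le k. \end{cases} \tag{60}$$

In particular, $\beta_1 = \alpha/(\alpha-1)$. We then have

$$\int_{S^2}\kappa_2 = A\beta_1^2 \quad\text{and}\quad \int_{S^3}\kappa_3 = B\beta_1^3,$$

so $\kappa$ is integrable. Also, for the asymptotic edge density in Theorem 1.3,

$$\xi(\kappa) = \int_{S^2}\kappa_2 + 3\int_{S^3}\kappa_3 = A\beta_1^2 + 3B\beta_1^3. \tag{61}$$

In the following subsections we apply our general results to determine various
characteristics of this particular random graph $G_n = G(n,\kappa)$.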
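
For readers who wish to experiment with the parameters, the constants in (60) and (61) are straightforward to evaluate. The short Python sketch below is only an illustration; the function names `beta` and `edge_density` are hypothetical, and $\alpha > 1$ is assumed as in the text.

```python
import math

def beta(k, alpha):
    """beta_k from (60): integral of x^(-k/alpha) over (0, 1]."""
    return alpha / (alpha - k) if alpha > k else math.inf

def edge_density(A, B, alpha):
    """Asymptotic edge density xi(kappa) from (61); assumes alpha > 1."""
    b1 = beta(1, alpha)
    return A * b1**2 + 3 * B * b1**3

print(beta(1, 3.0), beta(2, 3.0), beta(3, 3.0))
print(edge_density(1.0, 0.0, 3.0))
```

For $\alpha = 3$ this gives $\beta_1 = 3/2$, $\beta_2 = 3$ and $\beta_3 = \infty$, and $\xi(\kappa) = 9/4$ when $A = 1$, $B = 0$.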

8.1 Degree distribution 

From (33) and symmetry of $\kappa_2$ and $\kappa_3$ we see that

Js 
while for j = 1,2,3, 

Since an edge contributes 1 to the degree of each endvertex, while a triangle
contributes 2 to the degrees of its vertices, for each $x$, the measure $\lambda_x$ defined
by (34) is given by

$$\lambda_x = 2A\beta_1 x^{-1/\alpha}\,\delta_1 + 3B\beta_1^2 x^{-1/\alpha}\,\delta_2.$$

Theorem 6.3 then tells us that the degree distribution of $G_n = G(n,\kappa)$ converges
to the mixed compound Poisson distribution $\mathrm{MCPo}(\Lambda)$, where $\Lambda$ is the random
measure corresponding to $\lambda_x$, with $x$ chosen uniformly from $(0,1]$.

Note that if $B = 0$, then the limiting degree distribution is mixed Poisson,
while if $A = 0$, almost all degrees are even and the degrees divided by 2 have a
mixed Poisson distribution.

For the power law, note that the mean $\lambda(x)$ of $\lambda_x$ is simply

$$\lambda(x) = (2A\beta_1 + 6B\beta_1^2)\,x^{-1/\alpha} = c\,x^{-1/\alpha},$$

where $0 < c = 2A\beta_1 + 6B\beta_1^2 = 2\xi(\kappa)/\beta_1 < \infty$ is a constant depending on $A$, $B$
and $\alpha$. Choosing $x$ randomly from $(0,1]$, for any $k \ge c$ we have

$$\mathbb{P}(\lambda(x) > k) = \mathbb{P}\bigl(x < (k/c)^{-\alpha}\bigr) = (k/c)^{-\alpha},$$

so the distribution of $\lambda(x)$ has a power-law tail. Using the concentration
properties of Poisson distributions with large means, arguing as in the proof of
Corollary 13.1 of [10], it follows easily that

$$\mathbb{P}(\mathrm{MCPo}(\Lambda) > k) \sim (k/c)^{-\alpha}$$

as $k \to \infty$, so the asymptotic degree distribution does indeed have a power-law
tail with (cumulative) exponent $\alpha$.
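
The exact tail formula for the mean $\lambda(x) = c\,x^{-1/\alpha}$ can be checked by simulation. The Python sketch below is illustrative only: it samples the mean $\lambda(x)$, not the degree itself, and the values of `c`, `alpha`, `k`, the sample size and the seed are arbitrary choices rather than anything taken from the text.

```python
import random

def sample_mean_degree(c, alpha, rng):
    """Sample lambda(x) = c * x^(-1/alpha) with x uniform on (0, 1]."""
    x = 1.0 - rng.random()          # random() is in [0, 1), so x is in (0, 1]
    return c * x ** (-1.0 / alpha)

def empirical_tail(c, alpha, k, samples=200_000, seed=0):
    """Monte Carlo estimate of P(lambda(x) > k)."""
    rng = random.Random(seed)
    hits = sum(sample_mean_degree(c, alpha, rng) > k for _ in range(samples))
    return hits / samples

c, alpha, k = 2.0, 2.5, 4.0
exact = (k / c) ** (-alpha)         # (k/c)^(-alpha), valid since k >= c
approx = empirical_tail(c, alpha, k)
print(exact, approx)
```

With these parameters the exact tail probability is $2^{-5/2}$, and the estimate should agree with it up to sampling error.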

Let $d_k = \mathbb{P}(\mathrm{MCPo}(\Lambda) = k)$, so by Theorem 6.3 the asymptotic fraction of
vertices with degree $k$ is simply $d_k$. If $A > 0$ then it is not hard to check that
in fact

$$d_k \sim c' k^{-\alpha-1} \tag{62}$$

as $k \to \infty$, where $0 < c' = \alpha c^{\alpha} < \infty$, so the degree distribution is power-law in
this stronger sense. If $A = 0$, then $d_k = 0$ if $k$ is odd, but (62) still holds for
even $k$, for a different (doubled) constant $c'$.

8.2 The phase transition and the giant component 

From (5), we have $\kappa_e(x,y) = (2A + 6B\beta_1)x^{-1/\alpha}y^{-1/\alpha}$, which we may rewrite
as $\kappa_e(x,y) = \psi(x)\psi(y)$, where

$$\psi(x) = (2A + 6B\beta_1)^{1/2}x^{-1/\alpha}.$$

By Theorems 1.5 and 1.7, the largest component of $G_n$ is of size $\rho(\kappa)n + o_p(n)$,
and there is a giant component, i.e., $\rho(\kappa) > 0$, if and only if $\|T_{\kappa_e}\| > 1$. In this
case $\kappa_e$ is 'rank 1' in the terminology of [10], and we have

$$\|T_{\kappa_e}\| = \|\psi\|_2^2 = (2A + 6B\beta_1)\beta_2.$$

Hence, fixing $\alpha > 2$ and thus $\beta_1$ and $\beta_2$, there is a giant component if and only
if

$$2A + 6B\alpha/(\alpha-1) > (\alpha-2)/\alpha. \tag{63}$$

Turning to the normalized size $\rho(\kappa)$ of the giant component, Theorem
allows us to calculate this in terms of the solution to a functional equation.
Usually this is intractable, but for the special $\kappa$ we are considering this simplifies
greatly, as in the rank 1 case of the edge-only model; see Section 16.4 of [10], or
Section 6.2 of [38]. Indeed, writing $\rho(x)$ for the survival probability of $\mathfrak{X}_\kappa(x)$,
from (7) we have

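$$S_\kappa(\rho)(x) = \int_0^1 2A\,x^{-1/\alpha}y^{-1/\alpha}\rho(y)\,dy
 + \int_0^1\!\!\int_0^1 3B\,x^{-1/\alpha}y^{-1/\alpha}z^{-1/\alpha}\bigl(\rho(y) + \rho(z) - \rho(y)\rho(z)\bigr)\,dy\,dz,$$

which simplifies to

$$S_\kappa(\rho)(x) = x^{-1/\alpha}\bigl(2AC + 6B\beta_1 C - 3BC^2\bigr),$$

where

$$C = \int_0^1 x^{-1/\alpha}\rho(x)\,dx. \tag{64}$$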

By Lemma ?H\ we have $\rho(x) = 1 - \exp(-S_\kappa(\rho)(x))$, so

$$\rho(x) = 1 - \exp\bigl(-(2AC + 6B\beta_1 C - 3BC^2)\,x^{-1/\alpha}\bigr). \tag{65}$$

Although we defined $C$ in terms of $\rho$, we can view $C$ as an unknown constant,
define $\rho$ by (65), and substitute back into (64). The function $\rho$ then solves (8)
if and only if $C$ solves

$$C = \int_0^1 x^{-1/\alpha}\Bigl(1 - \exp\bigl(-((2A + 6B\beta_1)C - 3BC^2)\,x^{-1/\alpha}\bigr)\Bigr)\,dx, \tag{66}$$

and every solution to (8) arises in this way. In particular, by Theorems 2.4
and 1.7 there is a positive solution only in the supercritical case (when (63)
holds), and that solution is then unique; $C = 0$ is always a solution.
Transforming the integral using the substitution $y = x^{-1/\alpha}$, one can rewrite the right
hand side of (66) in terms of an incomplete gamma function, although it is not
clear this is informative. The point is that the form of $\rho(x)$ is given by (65),
and the constant can in principle be found as the solution to an equation, and
can very easily be found numerically for given values of $A$, $B$ and $\alpha$.
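
One possible way to carry out such a numerical computation is sketched below in Python. This is an illustration under stated assumptions, not the authors' procedure: it assumes $\alpha > 2$, removes the singularity at 0 via the substitution $x = u^{\alpha/(\alpha-1)}$ (under which $x^{-1/\alpha}\,dx = \beta_1\,du$), uses a plain midpoint rule, and iterates the map in (66) starting from the upper bound $C = \beta_1$. All function names are hypothetical.

```python
import math

def beta(k, alpha):
    return alpha / (alpha - k) if alpha > k else math.inf

def rhs(C, A, B, alpha, steps=2000):
    """Right-hand side of (66) after substituting x = u**(alpha/(alpha-1)).

    Then x**(-1/alpha) dx = beta_1 du and x**(-1/alpha) = u**(-1/(alpha-1)),
    so the integrand is bounded on (0, 1] and a midpoint rule suffices.
    """
    b1 = beta(1, alpha)
    g = (2 * A + 6 * B * b1) * C - 3 * B * C * C
    gamma = 1.0 / (alpha - 1.0)
    total = 0.0
    for j in range(steps):
        u = (j + 0.5) / steps
        total += 1.0 - math.exp(-g * u ** (-gamma))
    return b1 * total / steps

def solve_C(A, B, alpha, iterations=200):
    """Iterate C -> rhs(C), starting from C = beta_1 (since rho <= 1)."""
    C = beta(1, alpha)
    for _ in range(iterations):
        C = rhs(C, A, B, alpha)
    return C

def operator_norm(A, B, alpha):
    """(2A + 6 B beta_1) beta_2; a giant component exists iff this exceeds 1."""
    return (2 * A + 6 * B * beta(1, alpha)) * beta(2, alpha)

C_sub = solve_C(0.1, 0.0, 3.0)   # norm 0.6: subcritical
C_sup = solve_C(1.0, 0.0, 3.0)   # norm 6.0: supercritical
print(C_sub, C_sup)
```

In the subcritical example the iteration collapses to the trivial solution $C = 0$, while in the supercritical one it settles at a positive value; $\rho(x)$ is then recovered from (65).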

8.3 Subgraph densities 

In the following subsections we shall need expressions for $\tilde t(F,\kappa)$ for various
small graphs $F$, where $\tilde t(F,\kappa)$, defined by (53) and (54), may be thought of as
the asymptotic density of copies of $F$ in the kernel family $\kappa$.

We start with direct copies of $F$. Since all atoms are edges or triangles, the
only graphs $F$ that can be produced directly are edges, triangles, and $P_2$s, i.e.,
paths with 2 edges.



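Putting the specific kernels $\kappa_2$ and $\kappa_3$ into the formulae (56) and (57) from
the previous section, we have

$$\sigma_{K_2}(x,y) = (2A + 6B\beta_1)\,x^{-1/\alpha}y^{-1/\alpha}$$

and

$$\sigma_{P_2}(x,y,z) = 6B\,x^{-1/\alpha}y^{-1/\alpha}z^{-1/\alpha},$$

while

$$\sigma_{K_3}(x,y,z) = 6B\,x^{-1/\alpha}y^{-1/\alpha}z^{-1/\alpha}.$$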

Edges may be formed only directly, so either from (53) and (54) or from
(55), we have

$$\tilde t(K_2,\kappa) = \frac12\int_{S^2}\sigma_{K_2}(x,y)\,d\mu(x)\,d\mu(y) = A\beta_1^2 + 3B\beta_1^3,$$

which agrees, as it should, with (61). Since a triangle is 2-connected, it has no
non-trivial tree decomposition, and (53) and (54) give

$$\tilde t(K_3,\kappa) = \frac16\int_{S^3} 6\kappa_3(x,y,z)\,d\mu(x)\,d\mu(y)\,d\mu(z) = B\beta_1^3,$$

which may also be seen by noting that the only regular copies of a triangle are
those directly corresponding to $\kappa_3$.

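A copy of $P_2$ may be formed by a single triangular atom (a direct copy), but
may also be formed by two edges from different atoms. Hence, as in Example 7.2,

$$\tilde t(P_2,\kappa) = \frac12\int_{S^3}\bigl(\sigma_{P_2}(x,y,z) + \sigma_{K_2}(x,y)\sigma_{K_2}(y,z)\bigr)\,d\mu(x)\,d\mu(y)\,d\mu(z)$$
$$= \frac12\Bigl(6B\beta_1^3 + \int_{S^3}(2A + 6B\beta_1)^2\,x^{-1/\alpha}y^{-2/\alpha}z^{-1/\alpha}\,d\mu(x)\,d\mu(y)\,d\mu(z)\Bigr)$$
$$= 3B\beta_1^3 + 2(A + 3B\beta_1)^2\beta_1^2\beta_2.$$

In particular, if $\alpha \le 2$ then $\tilde t(P_2,\kappa)$ is infinite.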

For $S_3 = K_{1,3}$, the star with three edges, there are two types of tree
decompositions: three edges or one edge and one copy of $P_2$, the latter occurring
in 3 different ways. (There are no direct copies.) Hence,

$$t(S_3,\kappa) = \int \sigma_{K_2}(x_1,x_2)\sigma_{K_2}(x_1,x_3)\sigma_{K_2}(x_1,x_4) + 3\int \sigma_{P_2}(x_2,x_1,x_3)\sigma_{K_2}(x_1,x_4)$$
$$= (2A + 6B\beta_1)^3\beta_1^3\beta_3 + 18B(2A + 6B\beta_1)\beta_1^3\beta_2$$

and thus

$$\tilde t(S_3,\kappa) = \frac16(2A + 6B\beta_1)^3\beta_1^3\beta_3 + 3B(2A + 6B\beta_1)\beta_1^3\beta_2.$$

Finally, for $P_3$, there are again two types of tree decompositions: three edges
or one edge and one copy of $P_2$, the latter now occurring in 2 different ways.
Hence,

$$\tilde t(P_3,\kappa) = \frac12\int \sigma_{K_2}(x_1,x_2)\sigma_{K_2}(x_2,x_3)\sigma_{K_2}(x_3,x_4) + \frac22\int \sigma_{P_2}(x_1,x_2,x_3)\sigma_{K_2}(x_3,x_4)$$
$$= \frac12(2A + 6B\beta_1)^3\beta_1^2\beta_2^2 + 6B(2A + 6B\beta_1)\beta_1^3\beta_2.$$

As we shall see, the counts above are enough to calculate two more interesting
parameters of the graph $G_n = G(n,\kappa)$.
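
The closed forms of this subsection can be collected in a few lines of code. The Python sketch below is only a convenience for experimentation; it assumes $\alpha > 2$ (so that $\beta_1$ and $\beta_2$ are finite), and the names `beta` and `densities` are hypothetical.

```python
import math

def beta(k, alpha):
    return alpha / (alpha - k) if alpha > k else math.inf

def densities(A, B, alpha):
    """Copy densities for K2, K3, P2, S3, P3 in the edge-and-triangle model.

    Assumes alpha > 2; for 2 < alpha <= 3 the S3 entry is infinite.
    """
    b1, b2, b3 = (beta(k, alpha) for k in (1, 2, 3))
    s = 2 * A + 6 * B * b1          # constant factor of sigma_{K_2}
    return {
        "K2": A * b1**2 + 3 * B * b1**3,
        "K3": B * b1**3,
        "P2": 3 * B * b1**3 + 0.5 * s**2 * b1**2 * b2,
        "S3": s**3 * b1**3 * b3 / 6 + 3 * B * s * b1**3 * b2,
        "P3": 0.5 * s**3 * b1**2 * b2**2 + 6 * B * s * b1**3 * b2,
    }

d = densities(1.0, 0.0, 4.0)
print(d)
```

With $A = 1$, $B = 0$ and $\alpha = 4$ we have $\beta_1 = 4/3$, $\beta_2 = 2$, $\beta_3 = 4$, so for instance the $P_2$ density is $64/9$ and the $P_3$ density is $256/9$.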

8.4 The clustering coefficient 

The clustering coefficient $C(G)$ of a graph $G$ was introduced by Watts and
Strogatz [37] as a measure of the extent to which neighbours of a random vertex
in $G$ tend to be joined directly to each other. After the degree distribution, it is
one of the most studied parameters of real-world networks. As discussed in [13],
for example, there are several different definitions of such clustering coefficients.
One of these turns out to be most convenient for mathematical analysis, and is
also very natural; following [13], we call this coefficient $C_2(G)$. (Hopefully there
will be no confusion with our earlier use of $C_2(G)$ for the number of vertices in
the 2nd largest component.) The coefficient $C_2(G)$ may be defined as a certain
weighted average of the 'local clustering coefficients' at individual vertices, but
is also simply given by

$$C_2(G) = \frac{3n(K_3,G)}{n(P_2,G)}, \tag{67}$$

a ratio that is easily seen to lie between 0 and 1.
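
For a concrete graph, (67) can be evaluated directly. The Python sketch below assumes a simple graph given as a dictionary mapping each vertex to the set of its neighbours, with comparable vertex labels; returning 0 when there are no copies of $P_2$ is our own convention, and the name `clustering_c2` is hypothetical.

```python
from itertools import combinations

def clustering_c2(adj):
    """C_2(G) = 3 n(K_3, G) / n(P_2, G) for a simple graph.

    adj: dict mapping each vertex to the set of its neighbours.
    """
    # Each vertex of degree d is the middle vertex of d(d-1)/2 copies of P_2.
    paths = sum(len(nb) * (len(nb) - 1) // 2 for nb in adj.values())
    # Brute-force triangle count over all vertex triples.
    triangles = sum(1 for u, v, w in combinations(sorted(adj), 3)
                    if v in adj[u] and w in adj[u] and w in adj[v])
    return 3 * triangles / paths if paths else 0.0

triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
path = {0: {1}, 1: {0, 2}, 2: {1}}
pendant = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
print(clustering_c2(triangle), clustering_c2(path), clustering_c2(pendant))
```

The triangle attains the extreme value 1, the path gives 0, and a triangle with one pendant edge has 5 copies of $P_2$ and one triangle, giving $3/5$.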

Now from above we have $\tilde t(K_3,\kappa) = B\beta_1^3 < \infty$. Hence, by Theorem 7.4,

$$n_{\mathrm r}(K_3,G_n) = B\beta_1^3 n + o_p(n),$$

where, as usual, $G_n = G(n,\kappa)$. We shall return to exceptional copies of $K_3$
shortly.

If $\alpha \le 2$ then $\tilde t(P_2,\kappa)$ is infinite, and $G_n$ will whp contain more than $O(n)$
copies of $P_2$. Note that this is to be expected given the exponent of the asymptotic
degree distribution, since in this case the expected square degree is infinite.

From now on we suppose that $\alpha > 2$, so $\tilde t(P_2,\kappa)$ is finite. Suppose for the
moment that exceptional copies of $P_2$ and $K_3$ are negligible, i.e., that

$$n_{\mathrm x}(P_2,G_n),\ n_{\mathrm x}(K_3,G_n) = o_p(n). \tag{68}$$

By Theorem 7.4, we have $n_{\mathrm r}(P_2,G_n)/n = \tilde t(P_2,\kappa) + o_p(1)$ and $n_{\mathrm r}(K_3,G_n)/n =
\tilde t(K_3,\kappa) + o_p(1)$, so it follows that

$$C_2(G_n) = \frac{3\tilde t(K_3,\kappa)}{\tilde t(P_2,\kappa)} + o_p(1) = c_2(A,B,\alpha) + o_p(1),$$

where, from the formulae in Subsection 8.3,

$$c_2(A,B,\alpha) = \frac{3B\beta_1^3}{3B\beta_1^3 + 2(A + 3B\beta_1)^2\beta_1^2\beta_2}, \tag{69}$$

with $\beta_1$, $\beta_2$ given by (60). It follows that with the degree exponent $\alpha > 2$ fixed,
this special case of our model can achieve any possible value of the clustering
coefficient, with the trivial exception of 1 (achieved only by graphs that are
vertex disjoint unions of cliques). Indeed, $c_2(A,0,\alpha) = 0$ for any $A$, while
taking $A = 0$ we have

$$c_2(0,B,\alpha) = \frac{1}{1 + 6B\beta_1\beta_2},$$

which is decreasing as a function of $B$, and tends to 1 as $B \to 0$ and to 0 as
$B \to \infty$.
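
The behaviour of the limiting clustering coefficient as $B$ varies is easy to explore numerically. The following Python sketch evaluates (69) under the assumption $\alpha > 2$; the function names are hypothetical and the parameter values are arbitrary.

```python
def beta(k, alpha):
    """beta_k = alpha / (alpha - k); only valid for alpha > k."""
    return alpha / (alpha - k)

def c2(A, B, alpha):
    """Limiting clustering coefficient c_2(A, B, alpha) from (69); needs alpha > 2."""
    b1, b2 = beta(1, alpha), beta(2, alpha)
    num = 3 * B * b1**3
    return num / (num + 2 * (A + 3 * B * b1)**2 * b1**2 * b2)

alpha = 3.0
for B in (0.001, 0.1, 1.0, 100.0):
    # With A = 0 the value should equal 1 / (1 + 6 B beta_1 beta_2).
    print(B, c2(0.0, B, alpha), 1 / (1 + 6 * B * beta(1, alpha) * beta(2, alpha)))
```

With $A = 0$ and $\alpha = 3$ the values decrease from near 1 to near 0 as $B$ grows, in line with the remark above.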

Let us note in passing that by Theorem 7.4, if $2 < \alpha < 4$ then $n_{\mathrm r}(P_2,G_n)/n$
is concentrated around its finite mean even though its variance, which involves
the expected 4th power of the degree of a random vertex, tends to infinity.

So far we considered only regular copies of $P_2$ and $K_3$; we now turn our
attention to exceptional copies. Unfortunately, for any $\alpha$, some moment of our
kernel is infinite, so Theorem 7.5 does not apply. However, it is easy to describe
the set of moments relevant to the calculation of $\mathbb{E}\,n_{\mathrm x}(F,G_n)$ for the graphs $F$
we consider.

Suppose that $F$ is an exceptional triangle (or $P_2$; the argument is then
almost identical) in $G_n = G(n,\kappa)$. Since $F$ has (at most) three edges, there are
at most 3 atoms $F_i'$ contributing edges to $F$. Let $H$ be the union of these atoms,
considered as a multigraph. For example, if $F$ is the triangle $abc$, then $H$ might
consist of the union of the three triangles $abd$, $bcd$, and $cad$. In some sense this
will turn out to be the 'worst' case.

Let us fix the isomorphism type of $H$, defined in the obvious way. Let $h$
be the total number of vertices in $H$, and write $r = \sum_i(|F_i'| - 1) - (h - 1)$ for
the 'redundancy' of $H$. Since $F$ is exceptional, $r \ge 1$. The expected number
of exceptional $F$ arising in this way is exactly $n_{(h)}n^{-(h-1)-r}$ times a certain
integral of products of $\kappa_2$ and $\kappa_3$. From the form of $\kappa_2$ and $\kappa_3$, we may write
this, up to a constant factor depending only on $A$, $B$ and $H$, as

$$n_{(h)}n^{-(h-1)-r}\int_{S^h} x_1^{-n_1/\alpha}\cdots x_h^{-n_h/\alpha}\,d\mu(x_1)\cdots d\mu(x_h),$$

where $n_i$ is the number of the atoms $F_j'$ that contain the $i$th vertex of $H$. The
initial factor is at most $n^{1-r} \le 1$, while the integral is finite unless $n_i \ge \alpha$ for
some $i$. Since $H$ is made up of at most 3 atoms $F_j'$, we always have $n_i \le 3$, so
if $\alpha > 3$ then the relevant integrals (i.e., the relevant moments) are finite, and
we have $\mathbb{E}\,n_{\mathrm x}(K_3,G_n),\ \mathbb{E}\,n_{\mathrm x}(P_2,G_n) = O(1)$, which certainly implies (68).

In fact, we do not need to assume that $\alpha > 3$. Suppose that $2 < \alpha \le 3$.
Then in the multigraph version of the model, $\mathbb{E}\,n_{\mathrm x}(K_3,G_n) = \infty$. (Consider,
for example, three triangles sitting on 4 vertices as above.) On the other hand,
this does not mean that $n_{\mathrm x}(K_3,G_n)$ is often large. Indeed, when we choose our
vertex types uniformly from $(0,1]$, whp there is no vertex whose type $x$ is at
most $\delta = 1/(n\log n)$, say. Conditioning on this very likely event $\mathcal{A}$, we may
consider the restrictions of $\kappa_2$ and $\kappa_3$ to $(\delta,1]^2$ and $(\delta,1]^3$, respectively. Now the
expected number of copies of some pattern $H$ is at most a constant times

$$n^{1-r}\Bigl(\int_\delta^1 x^{-3/\alpha}\,dx\Bigr)^{s},$$

where $s$ is the number of vertices $i$ of $H$ with $n_i = 3$. Since the graph $K_3$ (or
$P_2$) we are trying to form has maximum degree 2, every vertex of $H$ with $n_i = 3$
corresponds to a redundancy, so we always have $r \ge s$. Up to constants and a
power of $\log n$ the integral is $n^{(3-\alpha)/\alpha} \le \sqrt{n}$, and it follows that

$$\mathbb{E}\bigl(n_{\mathrm x}(K_3,G_n) \mid \mathcal{A}\bigr) = O(\sqrt{n}) = o(n).$$

Since $\mathbb{P}(\mathcal{A}) = 1 - o(1)$, it follows that $n_{\mathrm x}(K_3,G_n) = o_p(n)$, even though its
expectation would not suggest this. The same holds for $n_{\mathrm x}(P_2,G_n)$, so we see
that (68) does indeed hold for any $\alpha > 2$, and the clustering coefficient is indeed
concentrated about $c_2(A,B,\alpha)$.

8.5 The mixing coefficient 

Another interesting parameter of real networks is the extent to which the degrees
of the two ends of a randomly chosen edge tend to correlate; positive correlation
is known as assortative mixing, and negative correlation as disassortative mixing.
To define this precisely, let $G$ be any graph, and let $vw$ be an edge of $G$ chosen
uniformly at random. More precisely, let $(v,w)$ be chosen uniformly at random
from all $2e(G)$ ordered pairs corresponding to edges of $G$. Let $D_v$ and $D_w$
denote the degrees of $v$ and $w$; we view these as random variables. Since the
events $\{v = v_1, w = v_2\}$ and $\{v = v_2, w = v_1\}$ have the same probability, the
random vertices $v$ and $w$ have the same distribution, so $D_v$ and $D_w$ have the
same distribution.
Let

$$a(G) = \frac{\mathrm{Cov}(D_v,D_w)}{\sqrt{\mathrm{Var}(D_v)\mathrm{Var}(D_w)}} = \frac{\mathrm{Cov}(D_v,D_w)}{\mathrm{Var}(D_v)}. \tag{70}$$

Here $G$ is fixed, and all expectations are with respect to the random choice of
$(v,w)$. Thus $a(G)$ is simply the correlation coefficient between the degrees of
the two ends of a randomly chosen edge, so $-1 \le a(G) \le 1$, and $a(G) > 0$
corresponds to assortative mixing and $a(G) < 0$ to disassortative mixing. This
mixing coefficient was introduced by Callaway, Hopcroft, Kleinberg, Newman
and Strogatz [20], building on work of Krapivsky and Redner [32], and has been
studied by many people, for example Newman [35]. In [20], $a(G)$ is denoted
$\rho(G)$; we avoid this notation as it clashes with our notation for the survival
probability of a branching process.

Fortunately, we need no new theory to evaluate $a(G)$ for $G = G(n,\kappa)$, since
$a(G)$ can be expressed in terms of small subgraph counts. More precisely, for
any graph $G$,

$$\mathbb{E}(D_v - 1) = \frac{1}{2e(G)}\sum_i\sum_{j\sim i}(d_i - 1) = \frac{1}{2e(G)}\sum_i d_i(d_i - 1) = \frac{n(P_2,G)}{e(G)},$$

where $i$ runs over all vertices of $G$, then $j$ over all neighbours of $i$, and $d_i$ is the
degree of vertex $i$ in $G$. Also,

$$\mathbb{E}\bigl((D_v - 1)(D_w - 1)\bigr) = \frac{1}{2e(G)}\sum_i\sum_{j\sim i}(d_i - 1)(d_j - 1) = \frac{2n(P_3,G) + 6n(K_3,G)}{2e(G)},$$

so

$$\mathrm{Cov}(D_v,D_w) = \mathrm{Cov}(D_v - 1, D_w - 1) = \frac{(n(P_3,G) + 3n(K_3,G))\,e(G) - n(P_2,G)^2}{e(G)^2}.$$

Also,

$$2e(G)\,\mathbb{E}\bigl((D_v - 1)(D_v - 2)\bigr) = \sum_i\sum_{j\sim i}(d_i - 1)(d_i - 2) = \sum_i d_i(d_i - 1)(d_i - 2) = 6n(S_3,G),$$

where $S_3 = K_{1,3}$ is the star with 3 edges. Thus

$$\mathrm{Var}(D_v) = \mathrm{Var}(D_v - 1) = \mathbb{E}\bigl((D_v - 1)(D_v - 2)\bigr) + \mathbb{E}(D_v - 1) - \bigl(\mathbb{E}(D_v - 1)\bigr)^2$$
$$= \frac{3n(S_3,G)\,e(G) + n(P_2,G)\,e(G) - n(P_2,G)^2}{e(G)^2}.$$

Hence

$$a(G) = \frac{(n(P_3,G) + 3n(K_3,G))\,e(G) - n(P_2,G)^2}{3n(S_3,G)\,e(G) + n(P_2,G)\,e(G) - n(P_2,G)^2}. \tag{71}$$

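The identity (71) can be confirmed on small graphs by comparing it with the correlation coefficient computed directly from the definition (70). The Python sketch below does this; it assumes a simple graph stored as a dictionary of neighbour sets with comparable vertex labels and with non-constant degrees along edges (otherwise the variance vanishes), and all function names are hypothetical.

```python
def subgraph_counts(adj):
    """Return e(G), n(P2), n(S3), n(K3), n(P3) for a simple graph."""
    deg = {v: len(nb) for v, nb in adj.items()}
    edges = [(u, v) for u in adj for v in adj[u] if u < v]
    e = len(edges)
    p2 = sum(d * (d - 1) // 2 for d in deg.values())
    s3 = sum(d * (d - 1) * (d - 2) // 6 for d in deg.values())
    # Each triangle is seen once from each of its 3 edges.
    k3 = sum(len(adj[u] & adj[v]) for u, v in edges) // 3
    # Paths k-u-v-l with middle edge uv: exclude k == l (a common neighbour).
    p3 = sum((deg[u] - 1) * (deg[v] - 1) - len(adj[u] & adj[v])
             for u, v in edges)
    return e, p2, s3, k3, p3

def mixing_from_counts(adj):
    """a(G) via the subgraph-count formula (71)."""
    e, p2, s3, k3, p3 = subgraph_counts(adj)
    return ((p3 + 3 * k3) * e - p2**2) / (3 * s3 * e + p2 * e - p2**2)

def mixing_direct(adj):
    """a(G) as the correlation of end degrees over all ordered edges."""
    pairs = [(len(adj[u]), len(adj[v])) for u in adj for v in adj[u]]
    m = len(pairs)
    mean = sum(dv for dv, _ in pairs) / m
    cov = sum(dv * dw for dv, dw in pairs) / m - mean**2
    var = sum(dv * dv for dv, _ in pairs) / m - mean**2
    return cov / var

path4 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
pendant = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
print(mixing_from_counts(path4), mixing_direct(path4))
print(mixing_from_counts(pendant), mixing_direct(pendant))
```

For the path on 4 vertices both computations give $-1/2$, and for a triangle with a pendant edge both give $-5/7$.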
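In well-behaved cases, for example for bounded kernel families, it follows from
our results here (Theorems 7.3–7.5) that if $G_n = G(n,\kappa)$, then
\[
  a(G_n) = a(\kappa) + o_p(1),  \tag{72}
\]
where
\[
  a(\kappa) = \frac{(t(P_3,\kappa) + 3t(K_3,\kappa))\,\xi(\kappa) - t(P_2,\kappa)^2}
                   {3t(S_3,\kappa)\,\xi(\kappa) + t(P_2,\kappa)\,\xi(\kappa) - t(P_2,\kappa)^2},
  \tag{73}
\]
with $\xi(\kappa) = t(K_2,\kappa)$; see (55).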
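
As a sanity check on the subgraph-count formula (71), the following Python sketch compares it with the correlation computed directly from the list of ordered edges. The function names and the example graph are ours and purely illustrative; the graph is assumed to be simple, with integer vertex labels, and is stored as a dictionary of neighbour sets.

```python
def subgraph_counts(adj):
    """Return e(G), n(P2), n(P3), n(K3), n(S3) for a simple graph
    given as {vertex: set_of_neighbours} with integer vertex labels."""
    deg = {v: len(nb) for v, nb in adj.items()}
    e = sum(deg.values()) // 2
    n_p2 = sum(d * (d - 1) // 2 for d in deg.values())
    n_s3 = sum(d * (d - 1) * (d - 2) // 6 for d in deg.values())
    # Each triangle u < v < w is counted exactly once.
    n_k3 = sum(1 for u in adj for v in adj[u] if u < v
               for w in adj[v] if v < w and w in adj[u])
    # Paths a-b-c-d on four distinct vertices; each is seen in both directions.
    n_p3 = sum(1 for b in adj for c in adj[b]
               for a in adj[b] if a != c
               for d in adj[c] if d != b and d != a) // 2
    return e, n_p2, n_p3, n_k3, n_s3


def mixing_from_counts(adj):
    """a(G) evaluated through the subgraph-count formula (71)."""
    e, n_p2, n_p3, n_k3, n_s3 = subgraph_counts(adj)
    numerator = (n_p3 + 3 * n_k3) * e - n_p2 ** 2
    denominator = 3 * n_s3 * e + n_p2 * e - n_p2 ** 2
    return numerator / denominator


def mixing_direct(adj):
    """Correlation of the end degrees over all 2e(G) ordered edges."""
    pairs = [(len(adj[u]), len(adj[v])) for u in adj for v in adj[u]]
    m = len(pairs)
    mean = sum(d for d, _ in pairs) / m
    cov = sum(d * dw for d, dw in pairs) / m - mean ** 2
    var = sum(d * d for d, _ in pairs) / m - mean ** 2
    return cov / var


# Illustrative graph: a triangle 0-1-2 with a small tree attached at vertex 2.
example = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3}, 5: {3}}
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
```

Both functions divide by the variance of $D_v$, so they are only meaningful for graphs that are not regular; on the star `star` the two ends of every edge have degrees 1 and 3, which gives perfectly disassortative mixing.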

Returning to our present specific example, substituting in the expressions
for $t(\cdot,\kappa)$ in Subsection 8.3, the ratio (73) turns out to be
\[
  a(\kappa) = \frac{3AB\beta_1^5}
    {\bigl(4\tilde\xi^4(\beta_1\beta_3 - \beta_2^2) + 2(A + 6B\beta_1)\tilde\xi^2\beta_2 + 3AB\beta_1\bigr)\beta_1^4},
  \tag{74}
\]
where $\tilde\xi = (A + 3B\beta_1) = \xi(\kappa)/\beta_1^2$. Let us make a few comments on these
expressions.

Firstly, the coefficients $t(K_2,\kappa)$, $t(P_2,\kappa)$, $t(K_3,\kappa)$ and $t(P_3,\kappa)$ are finite for
all $\alpha > 2$, while $t(S_3,\kappa)$ is finite if and only if $\alpha > 3$. For the numerator, one
can argue for $P_3$ as for $P_2$ and $K_3$ above to show that the number of exceptional
copies of $P_3$ is $o_p(n)$ and thus negligible for every $\alpha > 2$, and hence $n(P_3, G_n) =
t(P_3,\kappa)n + o_p(n)$ by Theorem 7.4. Consequently, the numerator in (71) (with
$G = G_n$) divided by $n^2$ converges in probability to the numerator in (73), and
this limit is finite. For $\alpha > 3$, one can argue in the same way to show that the
number of exceptional copies of $S_3$ is negligible, so $n(S_3, G_n) = t(S_3,\kappa)n + o_p(n)$
and (72) does indeed hold. For $\alpha \le 3$, when $t(S_3,\kappa) = \infty$, Theorem 7.4 implies
that $n(S_3, G_n)/n \overset{p}{\to} \infty$, so in this case $a(G_n) \overset{p}{\to} 0 = a(\kappa)$, for the not very
interesting reason that $\mathrm{Var}(D_v)$ is unbounded while $\mathrm{Cov}(D_v, D_w)$ is not. In any
case, we have shown that (72) holds in our example for every $\alpha > 2$.

Secondly, we see that $0 \le a(\kappa) < \infty$ for every $\alpha > 2$, with $a(\kappa) > 0$ whenever
$\alpha > 3$ and we add both edges and triangles (i.e., if both $A$ and $B$ are non-zero).

Thirdly, if $A$ and $B$ are both positive and comparable but very small, then
it is easy to see that $a(\kappa)$ is close to 1, for the simple reason that the graph then
consists of rather few (though still order $n$) edges and triangles, which are almost
all vertex disjoint. In this case we almost always have either $D_v = D_w = 1$, if
we pick an edge component, or $D_v = D_w = 2$ if we pick an edge of a triangle.
This is also easily checked algebraically from (74): after cancelling the common
factor $\beta_1^4$, the denominator is of the form $3AB\beta_1 + O((A + B)^3)$, which is
asymptotically equal to the numerator if $A, B \to 0$ with $A/B$ bounded above and
below. It follows that as $A$ and $B$ are varied, $a(\kappa)$ can take any value between
0 and 1, with 1 excluded.
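
The behaviour described in these comments can be explored numerically. The sketch below evaluates (74) with the common factor $\beta_1^4$ cancelled. It rests on assumptions that are not restated in this subsection: we take $\beta_k = \int_0^1 \varphi^k$ with $\varphi(x) = x^{-1/\alpha}$, so that $\beta_k = \alpha/(\alpha - k)$ for $k < \alpha$, and we read $\tilde\xi$ as $A + 3B\beta_1$. The function name is ours.

```python
def mixing_limit(A, B, alpha):
    """Evaluate a(kappa) as in (74), with the factor beta_1^4 cancelled.

    Assumptions (ours): phi(x) = x**(-1/alpha) on (0, 1], hence
    beta_k = alpha / (alpha - k); requires alpha > 2 and A + B > 0.
    """
    if alpha <= 2:
        raise ValueError("the example requires alpha > 2")
    if alpha <= 3:
        # beta_3 is infinite: the variance term dominates and a(kappa) = 0.
        return 0.0
    beta = {k: alpha / (alpha - k) for k in (1, 2, 3)}
    xi = A + 3 * B * beta[1]
    numerator = 3 * A * B * beta[1]
    denominator = (4 * xi ** 4 * (beta[1] * beta[3] - beta[2] ** 2)
                   + 2 * (A + 6 * B * beta[1]) * xi ** 2 * beta[2]
                   + numerator)
    return numerator / denominator


# With alpha fixed, shrinking A = B pushes the coefficient towards 1.
values = [mixing_limit(s, s, 4.0) for s in (1.0, 1e-1, 1e-3, 1e-5)]
```

Since $\beta_1\beta_3 > \beta_2^2$ by the Cauchy-Schwarz inequality, the denominator always exceeds the numerator, which is why the value 1 is approached but never attained.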

Finally, it is easy to check that the form of $a(\kappa)$ as a function of $A$, $B$ and $\alpha$
is very different from that of $C_2(A, B, \alpha)$ given in (69). It follows that with the
degree exponent $\alpha > 3$ fixed, if we vary $A$ and $B$ we may vary the clustering
coefficient and $a(\kappa)$ independently, subject to certain inequalities.

It so happens that in the example considered here, $a(\kappa)$ is always
non-negative, but it is easy to give examples where $a(\kappa) < 0$. Indeed, this arises
already in the edge-only case (of the kind we treated in [10]), even with the very
simple type space with two elements of weights $\mu\{1\} = p$ and $\mu\{2\} = q = 1 - p$,
$0 < p < 1$, taking $\kappa_2(1, 1) = 0$, $\kappa_2(2, 2) = 0$ and $\kappa_2(1, 2) = A > 0$. In symbols,
\[
  \kappa_2(x,y) = A\,\mathbf{1}[x \ne y],
\]
where $\mathbf{1}[\mathcal{E}]$ is the indicator function of the event $\mathcal{E}$.
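For this kernel (family)
\begin{align*}
  \kappa_e(x,y) &= 2\kappa_2(x,y) = 2A\,\mathbf{1}[x \ne y], \\
  \xi(\kappa) &= \int \kappa_2 = 2Apq, \\
  \sigma_{K_2}(x,y) &= 2\kappa_2(x,y) = \kappa_e(x,y) = 2A\,\mathbf{1}[x \ne y].
\end{align*}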

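Expanding the integrals as sums, it follows that
\begin{align*}
  t(K_2,\kappa) &= \frac12 \int \sigma_{K_2}(x,y)\,d\mu(x)\,d\mu(y) = \xi(\kappa) = 2Apq; \\
  t(P_2,\kappa) &= \frac12 \int \sigma_{K_2}(x_1,x_2)\sigma_{K_2}(x_1,x_3)\,d\mu(x_1)\,d\mu(x_2)\,d\mu(x_3) \\
                &= 2A^2(pq^2 + qp^2) = 2A^2pq; \\
  t(S_3,\kappa) &= \frac16 \int \sigma_{K_2}(x_1,x_2)\sigma_{K_2}(x_1,x_3)\sigma_{K_2}(x_1,x_4)\,d\mu(x_1)\,d\mu(x_2)\,d\mu(x_3)\,d\mu(x_4) \\
                &= \frac43 A^3(pq^3 + qp^3) = \frac43 A^3pq(p^2 + q^2); \\
  t(P_3,\kappa) &= \frac12 \int \sigma_{K_2}(x_1,x_2)\sigma_{K_2}(x_2,x_3)\sigma_{K_2}(x_3,x_4)\,d\mu(x_1)\,d\mu(x_2)\,d\mu(x_3)\,d\mu(x_4) \\
                &= 4A^3(pqpq + qpqp) = 8A^3p^2q^2.
\end{align*}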
Substituting these expressions into (73) and simplifying, we find that
\[
  a(\kappa) = -\frac{A(p - q)^2}{A(p - q)^2 + 1}.
\]
Hence $a(\kappa) \le 0$, and we have disassortative mixing as soon as $p \ne \frac12$, i.e., when
$p \in (0, \frac12) \cup (\frac12, 1)$. We see also that the coefficient $a(\kappa)$ can be made to take any
value in $(-1, 0]$ by choosing the parameters suitably.
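
The simplification can be checked in exact arithmetic. The following sketch expands the integrals as sums over the two types and forms the ratio (73), which we read as (71) with each $n(F,G)$ replaced by $t(F,\kappa)$ and $e(G)$ by $\xi(\kappa)$; here $t(K_3,\kappa) = 0$ since only edges are added. The function names are ours.

```python
from fractions import Fraction
from itertools import product


def two_type_mixing(A, p):
    """a(kappa) for the two-type kernel kappa_2(x, y) = A * 1[x != y]."""
    A, p = Fraction(A), Fraction(p)
    mu = {1: p, 2: 1 - p}

    def sigma(x, y):
        # sigma_{K_2}(x, y) = 2 * kappa_2(x, y)
        return 2 * A if x != y else Fraction(0)

    def integral(k, integrand):
        # Integral over k type variables, written as a finite sum.
        total = Fraction(0)
        for xs in product(mu, repeat=k):
            weight = Fraction(1)
            for x in xs:
                weight *= mu[x]
            total += integrand(*xs) * weight
        return total

    xi = Fraction(1, 2) * integral(2, sigma)
    t_p2 = Fraction(1, 2) * integral(3, lambda a, b, c: sigma(a, b) * sigma(a, c))
    t_s3 = Fraction(1, 6) * integral(
        4, lambda a, b, c, d: sigma(a, b) * sigma(a, c) * sigma(a, d))
    t_p3 = Fraction(1, 2) * integral(
        4, lambda a, b, c, d: sigma(a, b) * sigma(b, c) * sigma(c, d))
    t_k3 = Fraction(0)  # edge-only model: no triangles in the limit

    numerator = (t_p3 + 3 * t_k3) * xi - t_p2 ** 2
    denominator = 3 * t_s3 * xi + t_p2 * xi - t_p2 ** 2
    return numerator / denominator


def closed_form(A, p):
    """The simplified expression -A(p-q)^2 / (A(p-q)^2 + 1)."""
    A, p = Fraction(A), Fraction(p)
    d2 = (2 * p - 1) ** 2  # (p - q)^2 with q = 1 - p
    return -A * d2 / (A * d2 + 1)
```

For instance, `two_type_mixing(3, Fraction(1, 4))` and `closed_form(3, Fraction(1, 4))` agree as exact fractions, and the value at $p = \frac12$ is 0.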

One can easily combine the simple example above with that considered in
the bulk of this section to give graphs with power-law degree distributions with
various values of the clustering coefficient and of $a(G_n)$, now with negative
values of $a(G_n)$ possible. Perhaps the simplest way of giving such graphs is to
divide the type space $(0, 1]$ into two intervals $I_1 = (0, x_0]$ and $I_2 = (x_0, 1]$, take
$\varphi(x) = x^{-1/\alpha}$ on $I_1$ and $\varphi(x) = (x - x_0)^{-1/\alpha}$ on $I_2$, to set $\kappa_2(x, y) = A_1\varphi(x)\varphi(y)$
if one of $x$, $y$ is in $I_1$ and the other in $I_2$, and $\kappa_2(x, y) = A_2\varphi(x)\varphi(y)$ otherwise,
and to define $\kappa_3(x, y, z)$ to be some constant times $\varphi(x)\varphi(y)\varphi(z)$, where the
constant depends on how many of $x$, $y$ and $z$ lie in $I_1$.

9 Limits of sparse random graphs 

Although our main focus in this paper was the introduction of the model $G(n,\kappa)$,
and the study of the existence and size of the giant component in this graph,
we shall close by briefly discussing some connections to earlier work that arise
when considering the local structure of $G(n,\kappa)$.

Let us start by considering subgraph counts. As before, let $\mathcal{G}$ consist of one
representative of each isomorphism class of finite graphs, and let $\mathcal{F} \subset \mathcal{G}$ consist
of the connected graphs in $\mathcal{G}$. Given two graphs $F$ and $G$, let $\hom(F, G)$ be
the number of homomorphisms from $F$ to $G$, and $\mathrm{emb}(F, G)$ the number of
embeddings, so $\mathrm{emb}(F, G) = n(F, G)\,\mathrm{aut}(F)$. Writing $G_n$ for a graph with $n$
vertices, in the dense case, where $G_n$ has $\Theta(n^2)$ edges, one can combine the
normalized subgraph or embedding counts
\[
  s(F, G_n) = n(F, G_n)/n(F, K_n) = \mathrm{emb}(F, G_n)/\mathrm{emb}(F, K_n)
\]
to define a metric that turns out to have very nice properties. (Often one uses
the equivalent homomorphism densities $t(F, G_n) = \hom(F, G_n)/n^{|F|}$, but when
we come to sparse graphs embeddings are more natural than homomorphisms.)
A sequence $(G_n)$ converges in this subgraph metric if and only if there are
constants $s(F)$, $F \in \mathcal{F}$, such that $s(F, G_n) \to s(F)$ for each $F \in \mathcal{F}$. Lovász
and Szegedy [34] characterised the possible limits $(s(F))_{F \in \mathcal{F}}$ both in terms of
kernels and algebraically.

Borgs, Chayes, Lovász, Sós and Vesztergombi [18, 19] introduced the cut
metric $\delta_\square$ that we used in Section 2. They showed that this metric is equivalent
to the subgraph metric, as well as to various other notions of convergence for
sequences of dense graphs. One of the nicest features of these results is that for
every point in the completion of the space of finite graphs (with respect to any
of these metrics), there is a natural random graph model (called a $W$-random
graph in [34]) that produces sequences of graphs tending to this point. (See also
Diaconis and Janson [24], where connections to certain infinite random graphs
are described.)

Turning to sparse graphs, as described in [16, 17], the situation is much
less simple. When $G_n$ has $\Theta(n)$ edges, as here, the natural normalization is to
consider, for each connected $F$,
\[
  \tilde{s}(F, G_n) = \mathrm{emb}(F, G_n)/n = \mathrm{aut}(F)\,n(F, G_n)/n.
\]
Under suitable additional assumptions on the sequences $G_n$, one can again
combine these counts to define a metric, and consider the possible limit points.
Unfortunately, not much is known about these; see the discussion in [17].

Turning to our present model, Theorem 7.5 shows that if $\kappa$ is a kernel family
with only finitely many non-zero kernels and all moments finite, then
$\tilde{s}(F, G_n) \overset{p}{\to} t(F,\kappa)$ for all connected $F$, where $G_n = G(n,\kappa)$ and $t(F,\kappa)$ is
given by (53). This suggests the following question.

Question 1. Is there a simple characterization of those vectors $(t_F)_{F \in \mathcal{F}}$ for
which there is an integrable kernel family $\kappa$ such that $t_F = t(F,\kappa)$ for all $F \in \mathcal{F}$?

As unbounded kernel families may cause technical difficulties, it may make
sense to ask the same question with the restriction that $\kappa$ should be bounded.

Note that Question 1 is very different from the question answered by Lovász
and Szegedy [34]: our definition of $t(F,\kappa)$ is different from the corresponding
notion studied there, since it is adapted to the setting of sparse graphs. In
particular, if $\kappa$ consists only of a single kernel $\kappa_2$ (as in [34]), then we have
$t(F,\kappa) = 0$ for any $F$ that is not a tree.

As discussed in [17, Question 8.1], it is an interesting question to ask whether,
for various natural metrics on sparse graphs, one can provide natural random
graph models corresponding to points in the completion. For those vectors $(t_F)$
where the answer to Question 1 is yes, the model $G(n,\kappa)$ provides an affirmative
answer (at least if $\kappa$ is bounded, say). But these points will presumably only
be a very small subset of the possible limits, so there are many corresponding
models still to be found.

As noted in [17, Sections 3, 7], rather than considering subgraph counts
$s(F, G_n)$, for graphs with $\Theta(n)$ edges it is more natural to consider directly the
probability that the $t$-neighbourhood of a random vertex $v$ is a certain graph
$F$; the subgraph counts may be viewed as moments of these probabilities.

More precisely, let $\mathcal{G}^r$ be the set of isomorphism classes of connected, locally
finite rooted graphs, and for $t \ge 0$, let $\mathcal{G}^r_t$ be the set of isomorphism classes of
finite connected rooted graphs with radius at most $t$, i.e., in which all vertices
are within distance $t$ of the root. A probability distribution $\pi$ on $\mathcal{G}^r$ naturally
induces a probability distribution $\pi_t$ on each $\mathcal{G}^r_t$, obtained by taking a $\pi$-random
element of $\mathcal{G}^r$ and deleting any vertices at distance more than $t$ from the root.
Given $F \in \mathcal{G}^r_t$ and a graph $G_n$ with $n$ vertices, let $p_t(F, G_n)$ be the probability
that a random vertex $v$ of $G_n$ has the property that its neighbourhoods up to
distance $t$ form a graph isomorphic to $F$, with $v$ as the root. A sequence $(G_n)$
with $|G_n| \to \infty$ has local limit $\pi$ if
\[
  p_t(F, G_n) \to \pi_t(F)
\]
for every $F \in \mathcal{G}^r_t$ and all $t \ge 0$. This notion has been introduced in several
different contexts under different names: Aldous and Steele [4] used the term
'local weak limit', and Aldous and Lyons [3] the name 'random weak limit'.
Also, Benjamini and Schramm [7] defined a corresponding 'distributional limit'
of certain random graphs. Notationally it is convenient to map a graph $G_n$ to
the point $\phi(G_n) = (p_t(F, G_n)) \in X = \prod_t [0,1]^{\mathcal{G}^r_t}$ and to define $\phi(\pi)$ similarly.
Taking any metric $d$ on $X$ giving rise to the product topology, we obtain a
metric $d_{\mathrm{loc}}$ on the set of graphs together with probability distributions on $\mathcal{G}^r$,
and $(G_n)$ has local limit $\pi$ if and only if $d_{\mathrm{loc}}(G_n, \pi) \to 0$.
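
For small patterns the quantity $p_t(F, G_n)$ can be computed directly from the definition. The sketch below is only an illustration under our own conventions: graphs are dictionaries of neighbour sets, the $t$-neighbourhood of $v$ is taken to be the subgraph induced by the vertices within distance $t$ of $v$, a rooted pattern is a pair (number of vertices, set of edges) with root labelled 0, and rooted isomorphism is tested by brute force, which is feasible only for very small balls.

```python
from collections import deque
from itertools import permutations


def ball(adj, root, t):
    """Rooted graph induced by the vertices within distance t of root.

    Returned as (k, edges) with vertices relabelled 0..k-1 and root = 0.
    """
    dist = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        if dist[u] == t:
            continue  # do not explore beyond distance t
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    order = list(dist)  # insertion order, so the root comes first
    index = {v: i for i, v in enumerate(order)}
    edges = {frozenset((index[u], index[w]))
             for u in order for w in adj[u] if w in index}
    return len(order), edges


def rooted_isomorphic(g, h):
    """Brute-force test for an isomorphism fixing the root 0."""
    (n1, e1), (n2, e2) = g, h
    if n1 != n2 or len(e1) != len(e2):
        return False
    for perm in permutations(range(1, n1)):
        mapping = (0,) + perm
        if all(frozenset(mapping[v] for v in edge) in e2 for edge in e1):
            return True
    return False


def local_frequency(adj, pattern, t):
    """Fraction of vertices whose t-neighbourhood is isomorphic to pattern."""
    hits = sum(rooted_isomorphic(ball(adj, v, t), pattern) for v in adj)
    return hits / len(adj)


# A path with two edges rooted at its middle vertex, and a rooted edge.
cherry = (3, {frozenset((0, 1)), frozenset((0, 2))})
rooted_edge = (2, {frozenset((0, 1))})
cycle6 = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
path4 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
```

In the 6-cycle every vertex has the pattern `cherry` as its 1-neighbourhood, while in the path on four vertices this holds for the two inner vertices only.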

As noted in [17], under suitable assumptions (which will hold here if $\kappa$ is
bounded, for example), the two notions of convergence described above are
equivalent, and one can pass from the limiting normalized subgraph counts $s(F)$
to the distribution $\pi$ and vice versa. Also, if $\kappa$ is a bounded kernel, then the
random graphs $G(n,\kappa)$ defined in [10] have as local limit a certain distribution
associated to $\kappa$. This latter observation extends to the present model, and as
we shall now see, no boundedness restriction is needed.

Given an integrable hyperkernel $\kappa$, let $G_\kappa$ be the random (potentially
infinite) rooted graph associated to the branching process $\mathfrak{X}_\kappa$. This is defined in
the natural way: we take the root of $\mathfrak{X}_\kappa$ as the root vertex, and for each child clique
of the root we take a complete graph in $G_\kappa$, with these cliques sharing only the
root vertex. Each child $w$ of the root then corresponds to a non-root vertex in
one of these cliques, and we add further cliques meeting only in $w$ to correspond
to the child cliques of $w$, and so on.

More generally, given an integrable kernel family $\kappa = (\kappa_F)_{F \in \mathcal{F}}$ we may
define a random rooted graph $G_\kappa$ in an analogous way; we omit the details. We
write $\pi_\kappa$ for the probability distribution on $\mathcal{G}^r$ associated to $G_\kappa$.

Theorem 9.1. Let $\kappa$ be an integrable kernel family and let $G_n = G(n,\kappa)$. Then
\[
  d_{\mathrm{loc}}(G_n, \pi_\kappa) \overset{p}{\to} 0.
\]

The proof of this result, which may be seen as a much stronger form of
Lemma 3.2, will take a little preparation.

In fact, we conjecture that almost sure convergence holds for any coupling
of the $G_n$ for different $n$, and in particular if the different $G_n$ are taken to
be independent. (The case of independent $G_n$ is the extreme case, which by
standard arguments implies a.s. convergence for every other coupling too; a.s.
convergence in this case is known as complete convergence.)

Writing $\pi_{\kappa,t}$ for the probability distribution on $\mathcal{G}^r_t$ induced by $\pi_\kappa$, by
definition we have $d_{\mathrm{loc}}(G_n, \pi_\kappa) \overset{p}{\to} 0$ if and only if
\[
  p_t(F, G_n) \overset{p}{\to} \pi_{\kappa,t}(F)  \tag{75}
\]
for each $t$ and each $F \in \mathcal{G}^r_t$. The special case where $\kappa$ is a bounded hyperkernel
is essentially immediate: (75) is simply a formal statement of the local coupling
established for bounded hyperkernels in Section 3. Exactly the same argument
applies to a bounded kernel family. For the extension to general kernel families
we need a couple of easy lemmas.

Lemma 9.2. Let $\kappa$ be an edge-integrable kernel family. For any $\varepsilon > 0$ there
is a $\delta = \delta_1(\kappa, \varepsilon) > 0$ such that whp any $\delta n$ vertices of $G(n,\kappa)$ meet at most $\varepsilon n$
edges.

Proof. This is an extension of Proposition 8.11 of [10]; the proof carries over
mutatis mutandis, using Theorem 7.3 with $F = P_2$ to bound the sum of the
squares of the vertex degrees in the bounded case. The key step is to use
edge integrability to find a bounded kernel family $\kappa'$ such that $G(n,\kappa')$ may be
regarded as a subgraph of $G(n,\kappa)$ containing all but at most $\varepsilon n/2 + o_p(n)$ of
the edges. □

It turns out that we can weaken edge integrability to integrability. The price
we pay is that we cannot control the number of edges incident to a small set
of vertices, but only the size of the neighbourhood. As usual, given a set $A$ of
vertices in a graph $G$, we write $N^t(A)$ for the set of vertices at graph distance
at most $t$ from $A$, so $A \subset N(A) = N^1(A) \subset N^2(A) \cdots$.
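
The sets $N^t(A)$ are exactly what a multi-source breadth-first search truncated at depth $t$ produces. A minimal Python sketch follows (the function name and the dictionary-of-neighbour-sets representation are ours); it also illustrates the identity $N^{t+1}(A) = N(N^t(A))$, which is what allows a bound on $|N(A)|$ to be iterated.

```python
from collections import deque


def neighbourhood(adj, A, t):
    """Return N^t(A): all vertices at graph distance at most t from the set A."""
    dist = {v: 0 for v in A}
    queue = deque(dist)
    while queue:
        u = queue.popleft()
        if dist[u] == t:
            continue  # vertices at distance t are not expanded further
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return set(dist)


# Illustrative graph: the path 0-1-2-3-4.
path5 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
```

For example, on `path5` the call `neighbourhood(path5, {0}, 2)` returns the same set as applying `neighbourhood(path5, ..., 1)` twice.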

Lemma 9.3. Let $\kappa$ be an integrable kernel family. For any $\varepsilon > 0$ there is a
$\delta = \delta_2(\kappa, \varepsilon) > 0$ such that whp every set $A$ of at most $\delta n$ vertices of $G(n,\kappa)$
satisfies $|N(A)| \le \varepsilon n$.

Proof. Replacing each atom by a clique, we may and shall assume that $\kappa$ is a
hyperkernel. Let $\kappa'$ be the kernel family obtained from $\kappa$ by replacing each clique
by a star. Since $\kappa$ is integrable, $\kappa'$ is edge integrable. Let $\delta_1(\varepsilon) = \delta_1(\kappa', \varepsilon)$ be the
function given by Lemma 9.2, and set $\delta = \delta_1(\delta_1(\varepsilon)) > 0$. Then whp every set $A$
of at most $\delta n$ vertices of $G(n,\kappa')$ has $|N(A)| \le \delta_1(\varepsilon)n$ and hence $|N^2(A)| \le \varepsilon n$.
Coupling $G(n,\kappa)$ and $G(n,\kappa')$ in the obvious way, vertices adjacent in $G(n,\kappa)$
are at distance at most 2 in $G(n,\kappa')$, and the result follows. □

Let $v(G(n,\kappa))$ be the sum of the sizes (numbers of vertices) of the atoms
making up $G(n,\kappa)$. Our final lemma relates this sum to $\int\kappa = \sum_F |F| \int \kappa_F$.

Lemma 9.4. Let $\kappa = (\kappa_F)_{F \in \mathcal{F}}$ be an integrable kernel family. Then
$v(G(n,\kappa)) = n\int\kappa + o_p(n)$.

Proof. Let $X_r$ be the number of atoms with $r$ vertices, and $X = v(G(n,\kappa)) =
\sum_{r \ge 2} rX_r$. Let $c = \int\kappa$, and let $c_r$ be the contribution to $\int\kappa$ from kernels
corresponding to graphs $F$ with $r$ vertices, so $\mathbb{E}(rX_r) = (n)_r c_r/n^{r-1} \le nc_r$ and
$\sum_r c_r = c < \infty$. Given $\varepsilon > 0$, there is an $R$ such that $\sum_{r \le R} c_r > c - \varepsilon$. Each
$X_r$ has a Poisson distribution and is thus concentrated about its mean, so whp
\[
  X \ge \sum_{r \le R} rX_r \ge \sum_{r \le R} c_r n - \varepsilon n \ge (c - 2\varepsilon)n.
\]
Writing $(x)_+$ for $\max\{x, 0\}$, since $\varepsilon$ was arbitrary we have shown that
$(c - X/n)_+ \overset{p}{\to} 0$. Since $(c - X/n)_+$ is bounded, it follows that $\mathbb{E}(c - X/n)_+ \to 0$.
But $\mathbb{E}X \le cn$, so $\mathbb{E}(X/n - c)_+ = \mathbb{E}(X/n - c) + \mathbb{E}(c - X/n)_+ \to 0$. Hence
$(X/n - c)_+ \overset{p}{\to} 0$, so $X/n \overset{p}{\to} c$ as claimed. □

Combining the last two lemmas, we can now prove Theorem 9.1.

Proof of Theorem 9.1. As noted after the statement of the theorem, the case
where $\kappa$ is bounded is straightforward.

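Let $\kappa$ be an integrable kernel family, and let $G_n = G(n,\kappa)$. Fix $t \ge 1$,
$F \in \mathcal{G}^r_t$, and $\varepsilon > 0$. It suffices to prove that
\[
  |p_t(F, G_n) - \pi_{\kappa,t}(F)| \le \varepsilon + o_p(1).  \tag{76}
\]
Then letting $\varepsilon \to 0$ we have $p_t(F, G_n) \overset{p}{\to} \pi_{\kappa,t}(F)$, so (75) holds. Since $t$ and $F$
are arbitrary, this implies $d_{\mathrm{loc}}(G_n, \pi_\kappa) \overset{p}{\to} 0$.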

Applying Lemma 9.3 $t$ times, there is a $\delta > 0$ such that whp any set $A$
of at most $\delta n$ vertices of $G_n$ satisfies $|N^t(A)| \le \varepsilon n/2$. Since $\kappa$ is integrable,
there is a bounded kernel family $\kappa^M$ which satisfies $\kappa^M \le \kappa$ pointwise and
$\int\kappa - \int\kappa^M \le \delta/2$. As $M \to \infty$, we have $\kappa^M \nearrow \kappa$ pointwise, and it follows
that $\pi_{\kappa^M,t}(F) \to \pi_{\kappa,t}(F)$; the argument is as for Theorem 2.13(i). Taking $M$
large enough, we may thus assume that $|\pi_{\kappa^M,t}(F) - \pi_{\kappa,t}(F)| \le \varepsilon/2$. Let
$G_n^M = G(n,\kappa^M)$. Since $\kappa^M$ is bounded, we have $p_t(F, G_n^M) \overset{p}{\to} \pi_{\kappa^M,t}(F)$. Coupling $G_n$
and $G_n^M$ as usual so that $G_n^M \subseteq G_n$, let $B$ be the set of vertices incident with
an atom present in $G_n$ but not $G_n^M$. By Lemma 9.4 we have $|B| \le \delta n$ whp, so
whp no more than $\varepsilon n/2$ vertices are within distance $t$ of vertices in $B$. But then
$|p_t(F, G_n) - p_t(F, G_n^M)| \le \varepsilon/2$ whp, and (76) follows. □

The general question of which probability distributions on $\mathcal{G}^r$ arise as local
limits of sequences of finite graphs seems to be rather difficult. There is a
natural necessary condition noted in different forms in all of [3, 4, 7]; see also
[17, Section 7]. Aldous and Lyons [3] asked whether this condition is sufficient,
emphasizing the importance of this open question. Let us finish with a related
but perhaps much simpler question: given $\kappa$, we defined $\mathfrak{X}_\kappa$ as a branching
process in which the particles have types. But in the corresponding random
graph $G_\kappa$ these types are not recorded. This means that $\kappa$ cannot simply be
read out of the distribution of $G_\kappa$, i.e., out of $\pi_\kappa$. This suggests the following
question.

Question 2. Which probability distributions on $\mathcal{G}^r$ are of the form $\pi_\kappa$ for some
integrable kernel family $\kappa$?